Abstract



Statistical equilibrium models of coherent structures in two-dimensional and barotropic 
quasi-geostrophic turbulence are formulated using canonical and microcanonical en- 
sembles, and the equivalence or nonequivalence of ensembles is investigated for these 
models. The main results show that models in which the global invariants are treated 
microcanonically give richer families of equilibria than models in which they are 
treated canonically. Such global invariants are those conserved quantities for ideal 
dynamics which depend on the large scales of the motion; they include the total 
energy and circulation. For each model a variational principle that characterizes its 
equilibrium states is derived by invoking large deviations techniques to evaluate the 
continuum limit of the probabilistic lattice model. An analysis of the two different 
variational principles resulting from the canonical and microcanonical ensembles re- 
veals that their equilibrium states coincide only when the microcanonical entropy 
function is concave. These variational principles also furnish Lyapunov functionals 
from which the nonlinear stability of the mean flows can be deduced. While in the 
canonical model the well-known Arnold stability theorems are reproduced, in the 
microcanonical model more refined theorems are obtained which extend known sta- 
bility criteria when the microcanonical and canonical ensembles are not equivalent. 
A numerical example pertaining to geostrophic turbulence over topography in a zonal 
channel is included to illustrate the general results. 

Keywords: Statistical equilibria; Mean-field theory; Nonlinear stability; Geostrophic 
turbulence 
1 Introduction



A prominent feature of two-dimensional turbulence is the formation of large-scale coherent
structures among the small-scale fluctuations of the vorticity field [31], [47]. This
self-organization behavior results from the conservation of both energy and enstrophy (the
spatial second moment of vorticity) in inviscid, incompressible two-dimensional flow, which
causes a net flux of energy toward large scales and a net flux of enstrophy toward small
scales [30]. As a consequence, a freely evolving flow gradually tends toward an equilibrium
state consisting of a stable, steady flow on the large scales and disorganized motions on the
small scales. This generic behavior is confirmed by numerical simulations of high Reynolds
number flows in various settings. For instance, a freely decaying flow with doubly periodic
boundary conditions relaxes at long times to either a coherent dipole vortex or a double
shear layer [39], f|8| . Similarly, a weakly driven and dissipated flow is well approximated by
a nearly steady coherent structure on the large scales that changes slowly in response to
the driving and dissipation [^, [23].

Quasi-geostrophic turbulence behaves in a similar fashion, producing coherent structures
on the large scales of motion within a potential vorticity field that is turbulent on
a range of small scales [43]. In a geophysical context such as the active weather layer on
Jupiter, robust mean flows of this kind are observed in the form of persistent jets and spots
[34]. Numerous, but less obvious, examples of long-lived mean flows with these general
characteristics are also found in the Earth's oceans and atmosphere j4T|. Generically, these
coherent structures are shear flows or distributed vortices embedded in shear flows.

In this paper we study a statistical equilibrium theory of coherent structures in two- 
dimensional or barotropic quasi-geostrophic turbulence. Several theories of this kind have 
been proposed and their predictions have been analyzed in some detail; they include the 
Onsager-Joyce-Montgomery theory of a point vortex gas [28], [29], ||, the Kraichnan
energy-enstrophy theory [|6|, [24], [|, and the Miller-Robert theory of a continuum vorticity
field [36], [37], [44], [45]. A review and critique of these various theories is given in [49].
In that work it is shown that each of these theories relies upon some explicit or implicit 
assumptions concerning the form of the random vorticity field on the microscopic scale and 
that these different assumptions lead to different predictions about the coherent structure 
on the macroscopic scale. These differences stem from the way in which the generalized 
enstrophy invariants (the spatial higher moments of vorticity) are included in the various 
theoretical models. Unlike the global invariants associated with the conservation of energy 
and circulation, which are "rugged" invariants that depend on the large scales of motion, 
the generalized enstrophy invariants are "fragile" in the sense that they are sensitive to the 
vorticity fluctuations on the small scales. 

In the present paper we therefore consider a model in which the fragile invariants are 
replaced by a given probability distribution on the small-scale vorticity fluctuations, which 
we call the prior distribution. With respect to this underlying probabilistic description of 
the vorticity field, we then impose the rugged global invariants on the statistical equilibrium 
measure that defines the model. In this fashion we obtain a theory in which a single
prior distribution captures the microscopic effects and a few global invariants control the 
macroscopic features. 

Besides being more faithful to the continuum dynamics than the known theories, this 
model is more easily adapted to realistic physical situations. On the one hand, there can 
be practical advantages to having a model that utilizes only a few robust invariants, as
has been demonstrated in [32], [15]. On the other hand, a suitable prior distribution can
be fit directly to the one-point vorticity statistics available from numerical simulations
or physical data. Alternatively, it can be inferred indirectly by comparing the predicted
vorticity-streamfunction profile with an observed profile.

In the context of a model of this kind, we have the choice of building the equilibrium 
statistical measure from a canonical ensemble or from a microcanonical ensemble with 
respect to the rugged invariants. In most applications of statistical mechanics these
alternative formalisms define equivalent theories that have identical equilibrium states in the
thermodynamic limit [^, . It is rather surprising, therefore, to discover that in our models
of coherent structures the two ensembles are not always equivalent. In fact, we find that
there are regimes in which the equilibrium states for the microcanonical ensemble are
entirely omitted by the canonical ensemble. Moreover, numerical computations based on the
microcanonical model show that these regimes often contain mean flows of great physical
interest. In essence, the reason for this novel behavior lies in the character of the
statistical equilibrium models: they are local mean-field theories in which the continuum
limit is nonextensive, the interactions are long-range, and the inverse temperature is negative.

Given that some equilibrium states for the microcanonical model are not realized by the
corresponding canonical model, we are led to ask whether these most probable states
correspond to stable flows. We answer this question in the affirmative by proving that all
nondegenerate canonical and microcanonical equilibrium states define nonlinearly stable,
steady mean flows. In the canonical model, these results reduce to the well-known
Arnold stability theorems, which rely on Lyapunov functionals constructed from the rugged
invariants and the information (negative entropy) functional associated with the prior
distribution [0, [33]. In the microcanonical model, however, these standard Lyapunov functionals
are not positive definite at those equilibrium states which are not realized by the canonical
model. In the nonequivalent case we instead use a new class of Lyapunov functionals to
demonstrate the stability of the most probable flows for the microcanonical model. In this
construction we introduce a penalization of the standard functional with respect to the
microcanonical constraints that makes the resulting Lyapunov functional positive definite
at the microcanonical equilibrium states. Such penalized functionals are identical with the
so-called augmented Lagrangians used in methods for constrained optimization |5S| .
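
The effect of such a penalization can be seen in a toy finite-dimensional problem. This example is offered only as an illustration and is not taken from the paper: the functions $f$ and $g$ and the penalty parameter $\sigma$ are hypothetical, and the task is to minimize $f$ subject to $g = 0$.

```latex
% Toy problem (illustrative assumption): minimize f subject to g = 0.
f(x,y) = x^2 - y^2, \qquad g(x,y) = y .
% The constrained minimizer is (0,0) with Lagrange multiplier \lambda = 0,
% so the ordinary Lagrangian is indefinite there:
L(x,y) = f(x,y) + \lambda\, g(x,y) = x^2 - y^2 .
% Penalizing the constraint gives the augmented Lagrangian
L_\sigma(x,y) = f(x,y) + \lambda\, g(x,y) + \tfrac{\sigma}{2}\, g(x,y)^2
              = x^2 + \left( \tfrac{\sigma}{2} - 1 \right) y^2 ,
% which is positive definite at (0,0) whenever \sigma > 2.
```

The point (0,0) is a genuine constrained minimizer even though the unpenalized Lagrangian has a saddle there; the quadratic penalty restores definiteness without moving the critical point.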

These results support our contention that the natural formulation of a statistical equilib- 
rium model of coherent structures is the one in which conservation of generalized enstrophy 
is relegated to a prior distribution, and conservation of energy and circulation are imposed 
microcanonically. From a mathematical standpoint, this model is preferable to the
corresponding canonical model because, in general, its family of equilibrium states is richer.
From a physical point of view, the microcanonical conditions are pertinent because the 
energy and circulation are trapped in the largest scales of motion, and hence these rugged 
invariants are isolated from interactions with larger systems or ignored degrees of freedom. 
Reciprocally, the use of a prior distribution on the vorticity, which amounts to a canon- 
ical treatment of the generalized enstrophy invariants, acknowledges that the statistical 
properties of the vorticity on the small scales are determined by contact with a bath of 
unresolved turbulent motions. Finally, our refined stability theorems ensure that the most 
probable flows defined by the model are nonlinearly stable for any admissible values of the 
microcanonical constraints, even when the classical sufficient conditions for stability are 
not satisfied. 

The paper is organized as follows. In Section 2 we formulate a general equilibrium 
statistical model that includes two-dimensional and barotropic geostrophic turbulence with 
topography. After explaining the role of the prior distribution in the probabilistic lattice 
model, we construct the canonical and microcanonical models, respectively. In Section 3 we 
then present the variational principles for these two models in the continuum limit as the 
lattice spacing tends to zero. Our analysis makes use of large deviation techniques, which
are uniquely suited to derivations of this kind [18], [19]. In particular, we introduce a
coarse-graining of the microscopic vorticity field and present the fundamental large deviation
principle that this process satisfies. In another paper we state and prove a general theorem
that contains this result as a special case [21]. On the basis of this result, we develop
the variational principles governing the equilibrium macrostates in the canonical and the
microcanonical continuum models. We give the complete proofs of the large deviation
estimates needed to justify these variational principles in a companion paper [ZIJ, where
we treat a general class of models defined in terms of local mean-field interactions. In
Section 4 we turn to the equivalence of ensembles questions, invoking ideas from convex
analysis and constrained optimization theory to obtain sharp and complete results. A
more general treatment of these issues is also presented in the companion paper [20]. In
Section 5 we present the nonlinear stability theorems, first reviewing the known theorems 
that pertain to the canonical model and then developing the refinement of those theorems 
that applies to the microcanonical model. Finally, in Section 6 we display the results of 
some numerical solutions to the microcanonical variational principle for barotropic shear 
flows over a zonal topography. In this physically interesting problem the nonequivalence- 
of-ensembles behavior is quite conspicuous. 

Our presentation throughout this paper is a synthesis of physical modeling and math- 
ematical analysis, which is intended to focus on the conceptual aspects of the models we 
study. With this goal in mind, we omit many of the technical details and proofs, referring 
the reader to our other papers [5], [20], [21] for those aspects.
2 Formulation of the models



2.1 Two-dimensional and geostrophic turbulence. For the microscopic dynamics
that underlies our statistical equilibrium models we adopt an equation of motion that
contains as special cases the governing equations for both purely two-dimensional turbulence
and barotropic quasi-geostrophic turbulence. Namely, we consider the nonlinear advection
equation

$$\frac{\partial Q}{\partial t} + \frac{\partial Q}{\partial x_1}\frac{\partial \psi}{\partial x_2} - \frac{\partial Q}{\partial x_2}\frac{\partial \psi}{\partial x_1} = 0, \tag{1}$$

in which $Q = Q(x_1, x_2, t)$ and $\psi = \psi(x_1, x_2, t)$ are real scalar fields related by the elliptic
equation

$$Q = -\Delta\psi + r^{-2}\psi + b. \tag{2}$$

In this defining equation, $\Delta = \partial^2/\partial x_1^2 + \partial^2/\partial x_2^2$ denotes the Laplacian on $\mathbb{R}^2$; $r$ is a given
positive constant, which may be infinity; and $b = b(x_1, x_2)$ is a specified continuous function.
The flow velocity field $v$ is nondivergent and is determined from the streamfunction $\psi$ by
$v = (\partial\psi/\partial x_2, -\partial\psi/\partial x_1)$. For the sake of definiteness, we take the flow domain to be a
channel

$$X = \{\, x = (x_1, x_2) : |x_1| < \ell_1/2,\ |x_2| < \ell_2/2 \,\} \tag{3}$$

with a period length $\ell_1$ and finite width $\ell_2$. The boundary conditions for ideal flow in
such a channel are achieved by setting $\psi = 0$ on the walls $x_2 = \pm\ell_2/2$ and by imposing
$\ell_1$-periodicity in $x_1$.

Equations (1)-(2) reduce to the Euler equations governing incompressible, inviscid flows
in two dimensions when $r = \infty$ and $b = 0$. In this case $Q$ coincides with the vorticity
$\omega = \partial v_2/\partial x_1 - \partial v_1/\partial x_2$. In such a flow the conservation of momentum is equivalent to
the exact rearrangement of vorticity $\omega$ under the area-preserving flow maps for the velocity
field $v$ induced instantaneously by $\omega$.
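
As a hedged numerical illustration of the elliptic relation between the streamfunction and $Q$ in the channel, the sketch below evaluates $Q = -\Delta\psi + r^{-2}\psi + b$ by second-order finite differences, with periodicity in $x_1$ and $\psi = 0$ on the walls. The grid layout, the function name `potential_vorticity`, and the test field are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def potential_vorticity(psi, b, h1, h2, r=np.inf):
    """Return Q = -Laplacian(psi) + psi / r**2 + b on the interior rows.

    psi, b : arrays of shape (N1, N2 + 1); axis 0 is the periodic x1
             direction, axis 1 runs from wall to wall in x2.
    h1, h2 : grid spacings in x1 and x2 (illustrative assumptions).
    """
    # Periodic centered second difference in x1.
    d11 = (np.roll(psi, -1, axis=0) - 2.0 * psi + np.roll(psi, 1, axis=0)) / h1**2
    # Centered second difference in x2, defined on interior rows only.
    d22 = (psi[:, 2:] - 2.0 * psi[:, 1:-1] + psi[:, :-2]) / h2**2
    inner = slice(1, -1)
    return -(d11[:, inner] + d22) + psi[:, inner] / r**2 + b[:, inner]

# Hypothetical test field: an eigenfunction of -Laplacian + r**-2 in the channel.
l1, l2, r = 2.0, 1.0, 1.0
n1, n2 = 64, 64
x1 = -l1 / 2 + l1 * np.arange(n1) / n1
x2 = np.linspace(-l2 / 2, l2 / 2, n2 + 1)
X1, X2 = np.meshgrid(x1, x2, indexing="ij")
psi = np.cos(2 * np.pi * X1 / l1) * np.sin(np.pi * (X2 + l2 / 2) / l2)
eigenvalue = (2 * np.pi / l1) ** 2 + (np.pi / l2) ** 2 + r ** -2

Q = potential_vorticity(psi, np.zeros_like(psi), l1 / n1, l2 / n2, r)
# Discretization error against the exact relation Q = eigenvalue * psi.
max_error = np.max(np.abs(Q - eigenvalue * psi[:, 1:-1]))
```

Setting `r=np.inf` and `b = 0` in this sketch recovers the purely two-dimensional case, in which the returned field is the discrete vorticity.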

When a finite $r$ and a nonvanishing $b$ are included in (1)-(2), these general equations
contain the standard equations governing a shallow rotating layer of homogeneous
incompressible inviscid fluid in the limit of small Rossby number. In the geophysical literature
where these equations are derived and discussed [fy]], the nondimensionalized spatial
variables $(x_1, x_2)$ are written as $(x, y)$ and the geostrophic streamfunction $\psi$ is replaced by $-\psi$,
which also represents the nondimensionalized free-surface perturbation. Under appropriate
quasi-geostrophic scalings and up to first order in the Rossby number, the flow is
nondivergent and its potential vorticity $Q$, defined by (2), is advected by the flow according
to (1). The inhomogeneous term in (2) is given by $b = \beta y + h$, where $\beta$ is the gradient
of the Coriolis parameter $f = f(y)$ and $h$ is the height of the bottom topography. The
constant $r$ in (2) is the Rossby deformation radius $r = \sqrt{gH_0}/f_0$, which is determined by
the gravitational acceleration $g$, the mean fluid depth $H_0$, and a mean value $f_0$. We refer
the reader to the literature for a complete discussion of these fundamental equations and
their properties [41].
The general equations (1)-(2) also contain the governing equations for the so-called 1-1/2
layer model, in which a shallow upper layer lies on a deep lower layer of denser fluid whose
motion is unaffected by that in the upper layer. Besides having oceanographic applications,
this model is often used to describe the observed weather layer of the Jovian atmosphere
[26], [34]. In the applications to Jupiter, the lower layer flow is assumed to be steady, zonal
and geostrophically balanced. Then the potential vorticity for the active upper layer is
given by (2) with $b = \beta y - r^{-2}\psi_2(y)$, where $\psi_2$ denotes the streamfunction for the flow in
the lower layer. In this way, the deep flow produces an effective bottom topography. The
appropriate Rossby scale $r$ is determined as in the single layer model, except that a reduced
gravity $g'$ is used. With these choices, (1)-(2) govern the quasi-geostrophic dynamics of the
shallow upper layer.

From the point of view of statistical equilibrium theory, the underlying continuum
dynamics dictated by (1)-(2) serves as a mechanism for mixing the scalar field $Q$ subject
to the constraints imposed by the various conserved quantities for that dynamics. Indeed,
the equilibrium statistical models that we study are constructed by postulating that the
underlying dynamics is ergodic with respect to the ideal invariants. This ergodic hypothesis
is not expected to be universally valid. Nevertheless, numerous observations and
simulations of two-dimensional and geostrophic turbulence show that typically the self-induced
straining of the advected field $Q$ leads to an effective randomization of $Q$. For instance, in
a free evolution from a generic smooth field $Q^0$, $Q$ develops local finite-amplitude
fluctuations on a range of small scales as time progresses. This behavior is related to the direct
cascade of enstrophy to small scales. At the same time, $Q$ tends to organize into coherent
vortices at the large scales, and these vortices gradually merge into a final steady state.
This dual behavior is associated with the inverse cascade of energy to large scales. The
goal of the statistical equilibrium models is to characterize the typical steady mean flows
that persist on the large scales without resolving the small scales of motion. The validity
of these models must be checked a posteriori from their predictions, since a priori tests or
proofs of the ergodic hypothesis are generally not feasible.

The conserved quantities associated with (1)-(2) are the total energy $H$, the total
circulation $C$, and a family of generalized enstrophies $A$, given by

$$H = \frac{1}{2}\int_X \left[ \left(\frac{\partial\psi}{\partial x_1}\right)^2 + \left(\frac{\partial\psi}{\partial x_2}\right)^2 + r^{-2}\psi^2 \right] dx, \tag{4}$$

$$C = \int_X [\,Q - b\,]\, dx, \tag{5}$$

$$A = \int_X a(Q)\, dx, \tag{6}$$

where $a$ is an arbitrary, sufficiently smooth, real function on the range of $Q$. In addition, the
$x_1$-component of linear impulse (momentum) $M$ is also conserved in the channel geometry
that we consider; it is given by

$$M = \int_X x_2\,[\,Q - b\,]\, dx. \tag{7}$$

We note that the expression $Q - b = \zeta + r^{-2}\psi$ appearing in the circulation and impulse
integrals is a sum of the relative vorticity, $\zeta = -\Delta\psi$, and the vortex stretching term, $r^{-2}\psi$,
due to deformation of the free surface.
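
A short worked identity, added here as a reader's aid rather than as part of the original derivation, shows how the energy can be expressed through the pair $(\psi, Q - b)$. It assumes that the energy is the integral of $\frac12(|\nabla\psi|^2 + r^{-2}\psi^2)$ over $X$, that $\psi = 0$ on the walls and is periodic in $x_1$, and that $Q - b = -\Delta\psi + r^{-2}\psi$. Integrating by parts:

```latex
\begin{aligned}
H &= \frac12 \int_X \left( |\nabla\psi|^2 + r^{-2}\psi^2 \right) dx \\
  &= \frac12 \int_X \psi \left( -\Delta\psi + r^{-2}\psi \right) dx
     + \frac12 \oint_{\partial X} \psi\, \frac{\partial\psi}{\partial n}\, ds \\
  &= \frac12 \int_X \psi\, (Q - b)\, dx .
\end{aligned}
```

The boundary integral vanishes because $\psi = 0$ on the walls and the contributions from the two periodic sides cancel. The energy is therefore a quadratic functional of $Q - b$ once $\psi$ is expressed through the Green function of $-\Delta + r^{-2}$.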

While each of these quantities is precisely conserved by the continuum dynamics, the 
role that H, C and M play in the statistical equilibrium models differs dramatically from 
that played by the nonlinear enstrophies A. This crucial difference is a consequence of 
the fact that the generalized enstrophy invariants are sensitive to the small-scale structure 
of Q, while the energy, circulation, and impulse invariants depend on the large scales of 
motion. For this reason, in the following two subsections we formulate the various models 
by first defining the probabilistic structure of the small scales and then introducing the 
conditioning determined by the global invariants for the large scales. 



2.2 Generalized enstrophies and small-scale fluctuations. In order to define our
continuum models, we first replace the infinite dimensional phase space of continuum
vorticity fields $Q$ by a sequence of finite dimensional phase spaces and then take an
appropriate continuum limit. To this end, we introduce a lattice $\mathcal{L}$ on the domain $X$ having $n$
sites and construct a probabilistic lattice model for each $n$. It suffices to use a uniform
intersite spacing in both the $x_1$ and $x_2$ directions; say, a dyadic partition of the
intervals $-\ell_1/2 < x_1 < \ell_1/2$ and $-\ell_2/2 < x_2 < \ell_2/2$ into $2^{m_1}$ and $2^{m_2}$ equal parts, so that
$n = 2^{m_1 + m_2}$. The domain $X$ then consists of the disjoint union of $n$ microcells $M(s)$
indexed by the sites $s$ in the lattice $\mathcal{L}$. The phase space for the lattice model is the product
space $\Omega_n = \mathbb{R}^n$, the microstates in the lattice model being points in $\Omega_n$. We identify
these microstates with vorticity fields $Q$ that are piecewise-constant relative to $\mathcal{L}$; that is,
$Q(x) = Q(s)$ for all $x \in M(s)$, $s \in \mathcal{L}$. For the sake of simplicity, we shall use the same
notation for the continuum field $Q(x)$, $x \in X$, governed by the underlying partial differential
equations and the discretized field $Q(s)$, $s \in \mathcal{L}$, in the lattice model.

The small-scale fluctuations of the microstates in the lattice model are described by the
product measure

$$\Pi_n(dQ) = \prod_{s \in \mathcal{L}} \rho(dQ(s)) \quad \text{on } \Omega_n, \tag{8}$$

in which $\rho(dy)$ is a given probability distribution on $\mathbb{R}$. Here and throughout the paper,
$y$ denotes a real variable running over the range of $Q$. With respect to the probability
distribution $\Pi_n$ the microscopic fields $Q$ consist of $n$ independent, identically distributed
random variables over the $n$ microcells in the lattice. We refer to the common distribution
$\rho$ as the prior distribution, signifying that it describes the statistical properties of the
microstate $Q$ before the conditioning due to the rugged invariants is imposed.
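
To make the product-measure construction concrete, the following sketch draws one microstate on a dyadic lattice by sampling each site independently from a prior. The two-point prior and the function names `sample_microstate` and `two_point_prior` are hypothetical choices for illustration only; the text allows a general prior distribution.

```python
import numpy as np

def sample_microstate(m1, m2, prior_sampler, rng):
    """Draw one microstate Q on a dyadic lattice with n = 2**(m1 + m2) sites.

    Each site value is an independent draw from the same prior, which is
    what sampling from the product measure means.
    """
    shape = (2**m1, 2**m2)
    return prior_sampler(rng, shape)

def two_point_prior(rng, shape):
    """Hypothetical prior: Q(s) = -1 or +1 with equal probability."""
    return rng.choice(np.array([-1.0, 1.0]), size=shape)

rng = np.random.default_rng(1)
Q = sample_microstate(4, 3, two_point_prior, rng)   # 16 x 8 lattice, n = 128
```

Any other sampler with the same `(rng, shape)` signature can be substituted, for example a Gaussian prior or an empirical one-point distribution.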






When $\rho(dy) = e^{-a(y)}dy$, $y \in \mathbb{R}$, for some continuous function $a$ on $\mathbb{R}$, the product
measure $\Pi_n$ in (8) coincides with the canonical Gibbs measure with respect to $A_n = \frac{1}{n}\sum_{s \in \mathcal{L}} a(Q(s))$,
which is the discretization of the generalized enstrophy integral $A = \int_X a(Q)\,dx$. That is,

$$\Pi_n(dQ) = e^{-nA_n(Q)} \prod_{s \in \mathcal{L}} dQ(s) = \prod_{s \in \mathcal{L}} e^{-a(Q(s))}\, dQ(s).$$

In light of this identity, the role of $\Pi_n(dQ)$ in the lattice model is evident from the general
principles of statistical mechanics: it is the most probable distribution on $\Omega_n$ with respect
to the phase volume $dQ = \prod dQ(s)$ that is consistent with the conservation of generalized
enstrophy $A_n$. Typically, this characterization of the canonical ensemble is justified by two
dynamical properties: 1) the invariance under the phase flow of the phase volume $dQ$; 2)
the dynamical invariance of the function $A_n$. In the models we study, however, a lattice
dynamics discretizing the underlying continuum dynamics for which these two properties
hold is not known. Consequently, it is necessary to treat the construction of the product
measure $\Pi_n(dQ)$ as a modeling issue, justifying its choice on whatever theoretical results
are available and whatever practical considerations are at hand.
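
The factorization of the Gibbs weight over lattice sites can be checked numerically. The sketch below compares the logarithm of $e^{-nA_n(Q)}$ with the logarithm of the site-by-site product; the normalization $A_n = \frac{1}{n}\sum_s a(Q(s))$ and the quadratic choice of $a$ (a Gaussian prior) are assumptions made for this illustration.

```python
import numpy as np

def discretized_enstrophy(q, a):
    """A_n(Q) = (1/n) * sum over sites of a(Q(s)); the 1/n factor is an assumption."""
    return np.mean(a(q))

def log_product_density(q, a):
    """Log of prod_s exp(-a(Q(s))), the unnormalized product density."""
    return -np.sum(a(q))

def quadratic(y):
    """Hypothetical choice a(y) = y**2 / 2, which gives a Gaussian prior."""
    return 0.5 * y**2

rng = np.random.default_rng(0)
q = rng.normal(size=256)                                  # one microstate, n = 256
lhs = -q.size * discretized_enstrophy(q, quadratic)       # log of exp(-n A_n(Q))
rhs = log_product_density(q, quadratic)                   # log of the site product
```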

The principal reason for preferring the canonical ensemble $\Pi_n(dQ)$ to the corresponding
microcanonical ensemble is the sensitivity of the generalized enstrophies $A$ to small-scale
motions. In physical terms $\Pi_n(dQ)$ describes a random field $Q$ on the lattice $\mathcal{L}$ in which
there is a coupling between the fluid motions on scales resolved by the lattice and the
unresolved turbulence on smaller scales. As in standard statistical equilibrium theory,
the canonical formulation is appropriate to a system coupled to a reservoir, or thermal
bath [0, [|. The prior distribution $\rho$ that parametrizes $\Pi_n(dQ)$ is effectively a generalized
inverse temperature for the potential vorticity fluctuations on the lattice microscale. By
contrast, the microcanonical ensemble based on a (finite or infinite) family of generalized
enstrophies $A_n$ enforces the exact conservation of each $A_n$ on the lattice, inhibiting the
exchange of generalized enstrophy between the resolved scales and the unresolved scales.
The well-known flux of enstrophy to small scales therefore invalidates the microcanonical
formulation.

Statistical equilibrium theories of the long-time average behavior of solutions to (1)-(2)
have tended to emphasize the microcanonical formulation. Originally, Miller [36], [37]
and Robert [44], [45] independently constructed a model by assuming that the exact
rearrangement of vorticity under the continuum dynamics is imitated on the lattice $\mathcal{L}$ by an
unspecified lattice dynamics. Under this assumption all generalized enstrophies $A_n$ are
exactly conserved in the lattice model. This approach produces a well-defined model in
which the complete family of vorticity invariants is imposed microcanonically. Later,
Turkington [49], [5] criticized the assumption made in the Miller-Robert model and formulated
a modification of it that is derived instead from the underlying exact continuum dynamics
on $X$. In the Turkington model, the evolution of the continuum vorticity field is observed
on the lattice $\mathcal{L}$ by averaging over the scales smaller than the lattice microscale, and
consequently the family of equality constraints on all generalized enstrophies imposed in the
Miller-Robert model is replaced by a weaker family of inequality constraints on all convex
enstrophies. This approach results in a model that accounts for the partial loss of the
nonlinear enstrophies to submicroscale fluctuations. Among statistical equilibrium models
that associate a final coherent state with a given initial state this model is the most faithful
to the underlying ideal continuum dynamics.

For the reasons mentioned above, however, a canonical formulation with respect to
the generalized enstrophies usually furnishes a more appropriate physical model than a
microcanonical formulation. Moreover, the equilibrium equations for the Turkington model
are isomorphic to those for the canonical model with a prior distribution (8) under the
identification $\rho(dy) = e^{-a(y)}dy$; in the microcanonical case the function $a$ is determined by
the Kuhn-Tucker multipliers for the family of convex enstrophy inequalities, while in the
canonical case it is prescribed [49], ||. Whether all microcanonical equilibria are realized as
canonical equilibria is not known.

In practical applications these statistical equilibrium models are used to produce families
of most probable large-scale flows that coexist with other complex mechanisms influencing
the small-scale motions. Under these circumstances the canonical ensemble (8) is often
desirable because the prior distribution $\rho$ can be used to model the one-point probability
distribution of the vorticity fluctuations. On the other hand, the constraints on generalized
enstrophies, or potential vorticity moments, are of dubious relevance in these realistic
situations. For instance, in two-dimensional turbulence with weak driving and small dissipation
it is possible to invoke a statistical equilibrium model as an adiabatic approximation to the
evolution of the large-scale structure [32], [23], [HJ . In these applications only the lowest-order
moments of vorticity are sufficiently robust to be retained in the model. Similarly,
comparisons with direct numerical simulations of freely decaying turbulence show good agreement
with the predictions of the model only when the higher-order moments of vorticity are
altered to account for dissipation |J. These tests show that it is necessary to take a prior
distribution that is compatible with the relaxed final state. In the context of geostrophic
turbulence, the modeling of the turbulent small scales is further complicated by the possible
effects of nonvanishing Rossby and Froude numbers |Ji2| . Given the asymptotic nature of
the quasi-geostrophic equations themselves, it is reasonable to fit the prior distribution to
available data. In Section 6, we briefly indicate how this empirical approach can be used
to formulate a model of zonal jets in a Jovian atmosphere.

For the purposes of our general discussion throughout Sections 3, 4 and 5, we let the
prior distribution $\rho$ be an arbitrary probability distribution on $\mathbb{R}$ subject only to the decay
condition (11), and we base all of our models on the canonical ensemble (8) parametrized by
such $\rho$. This simple choice of the product measure $\Pi_n(dQ)$ is natural in the context of
statistical equilibrium theory. Any better choice would require a new theory of the correlation
structure of turbulent scales, derived presumably from nonequilibrium considerations.

2.3 Global invariants and large-scale motions. The statistical equilibrium lattice
models that we consider are constructed by imposing the global invariants $H$ and $C$ on the
product measure $\Pi_n(dQ)$. In this construction we can consider either the canonical or the
microcanonical ensemble with respect to these invariants. A main goal of this paper is to
investigate the equivalence or nonequivalence of these two different ensembles. Accordingly,
we now proceed to formulate these canonical and microcanonical models.
The canonical model is defined by the Gibbs distribution

$$P_{n,\beta,\gamma}(dQ) = Z_n(\beta,\gamma)^{-1} \exp\big( -n\beta H_n(Q) - n\gamma C_n(Q) \big)\, \Pi_n(dQ), \tag{9}$$

and is parametrized by $\beta, \gamma \in \mathbb{R}$, which play the roles of "inverse temperature" and
"chemical potential," respectively. The partition function

$$Z_n(\beta,\gamma) = \int_{\Omega_n} \exp\big( -n\beta H_n(Q) - n\gamma C_n(Q) \big)\, \Pi_n(dQ)$$

normalizes the probability distribution $P_{n,\beta,\gamma}(dQ)$ on $\Omega_n$. We use the traditional notation $\beta$
for inverse temperature even though this symbol overlaps with that used in the geophysical
literature for the gradient of the Coriolis parameter; we expect that the distinction will be
clear enough from context.
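
The canonical reweighting of the prior can be sketched numerically by attaching to each prior sample a weight proportional to $\exp(-n\beta H_n - n\gamma C_n)$ and normalizing by the empirical partition function. Everything specific below is an assumption for illustration: the one-dimensional lattice, the two-point prior, the smooth kernel in `toy_energy` (a stand-in, not the true averaged Green function), and the choice $b = 0$ in `circulation`.

```python
import numpy as np

def toy_energy(q):
    """Hypothetical stand-in for H_n: quadratic form with a smooth long-range kernel."""
    n = q.size
    s = np.arange(n)
    kernel = np.exp(-np.abs(s[:, None] - s[None, :]) / n)   # not the true Green function
    return q @ kernel @ q / (2.0 * n**2)

def circulation(q):
    """Lattice circulation with b = 0 (assumption): the site average of Q."""
    return q.mean()

def canonical_weights(samples, beta, gamma):
    """Normalized weights proportional to exp(-n*beta*H_n(Q) - n*gamma*C_n(Q))."""
    n = samples.shape[1]
    log_w = np.array([-n * beta * toy_energy(q) - n * gamma * circulation(q)
                      for q in samples])
    log_w -= log_w.max()            # stabilize before exponentiating
    w = np.exp(log_w)
    return w / w.sum()              # division by the empirical partition function

rng = np.random.default_rng(3)
samples = rng.choice(np.array([-1.0, 1.0]), size=(2000, 32))   # draws from the prior
circ = samples.mean(axis=1)

w_flat = canonical_weights(samples, beta=0.0, gamma=0.0)   # recovers the prior
w_tilt = canonical_weights(samples, beta=0.0, gamma=0.5)   # tilts toward low circulation
mean_circ_prior = np.sum(w_flat * circ)
mean_circ_tilted = np.sum(w_tilt * circ)
```

Self-normalized weights of this kind degrade quickly as $n$ grows, so the sketch is meant to show the structure of the Gibbs distribution, not to serve as a practical sampler.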

The microcanonical model is defined by the conditional distribution

$$P_n^{E,\Gamma}(dQ) = \Pi_n\{\, dQ \mid H_n(Q) = E,\ C_n(Q) = \Gamma \,\}, \tag{10}$$

at given values $E$ and $\Gamma$ of the global invariants. For technical reasons, it is necessary
to replace the exact equality $H_n = E$ in (10) by a containment $H_n \in [E - \epsilon, E + \epsilon]$ for
a small finite $\epsilon > 0$, and similarly for the exact equality $C_n = \Gamma$. For the sake of clarity
of exposition, however, we will ignore this technical point and set $\epsilon = 0$ throughout our
discussion, leaving the obvious adjustments to the reader.
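
The thickened constraint with a small finite tolerance lends itself to a simple rejection sketch: draw microstates from the prior and keep those whose energy and circulation fall in the prescribed windows. As before, the lattice, the two-point prior, the kernel in `toy_energy`, and the choice of target values from a reference microstate are all hypothetical and serve only to illustrate the conditioning.

```python
import numpy as np

def toy_energy(q):
    """Hypothetical stand-in for H_n: quadratic form with a smooth long-range kernel."""
    n = q.size
    s = np.arange(n)
    kernel = np.exp(-np.abs(s[:, None] - s[None, :]) / n)   # not the true Green function
    return q @ kernel @ q / (2.0 * n**2)

def circulation(q):
    """Lattice circulation with b = 0 (assumption): the site average of Q."""
    return q.mean()

def microcanonical_shell(samples, E, Gamma, eps):
    """Keep prior samples with H_n in [E - eps, E + eps] and C_n in [Gamma - eps, Gamma + eps]."""
    keep = [q for q in samples
            if abs(toy_energy(q) - E) <= eps and abs(circulation(q) - Gamma) <= eps]
    return np.array(keep)

rng = np.random.default_rng(4)
samples = rng.choice(np.array([-1.0, 1.0]), size=(2000, 64))   # draws from the prior

# Targets taken from a reference microstate, so the shell is never empty.
reference = samples[0]
E, Gamma, eps = toy_energy(reference), circulation(reference), 0.05
shell = microcanonical_shell(samples, E, Gamma, eps)
```

The accepted samples are draws from the prior conditioned on the thickened constraints; shrinking `eps` sharpens the conditioning at the cost of a lower acceptance rate.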

The functionals $H_n$ and $C_n$ in (9) and (10) are the lattice versions of the functionals $H$
and $C$ defined on the continuum field $Q$ in (4) and (5), respectively. $H_n$ and $C_n$ act on $\Omega_n$
by identifying each microstate $Q \in \Omega_n$ with the corresponding piecewise-constant function
$Q \in L^2(X)$, and by evaluating the functionals $H$ and $C$ on that field; the corresponding
solution $\psi$ to (2) then determines $H(Q)$. Some straightforward calculations show that they
have the explicit expressions

$$H_n(Q) = \frac{1}{2n^2} \sum_{s \in \mathcal{L}} \sum_{s' \in \mathcal{L}} g_n(s,s')\, Q(s)\, Q(s') - \frac{1}{n} \sum_{s \in \mathcal{L}} h_n(s)\, Q(s),$$

$$C_n(Q) = \frac{1}{n} \sum_{s \in \mathcal{L}} [\, Q(s) - b(s) \,],$$

where $g_n(s,s')$ is the average over $M(s) \times M(s')$ of the Green function $g(x,x')$ defined
by $(-\Delta + r^{-2})g = \delta(x - x')$, and $h_n(s)$ is the average over $M(s)$ of the solution $h(x)$ to
$(-\Delta + r^{-2})h = b(x)$; both $g(x,x')$ and $h(x)$ satisfy the boundary conditions on $\partial X$ imposed
on $\psi(x)$. The lattice energy $H_n$ consists of a quadratic self-interaction term with a potential
$g_n$ and a linear term involving $h_n$ that represents interaction with the bottom topography.

It is important to note that the vortex self-interactions governed by $H_n$ are long-range,
being determined essentially by the Green function $g(x,x')$ for the partial differential
operator $-\Delta + r^{-2}$ on $X$. This property, combined with the form of the product prior distribution
$\Pi_n(dQ)$, gives these statistical equilibrium models their character as local mean-field
theories. Moreover, the long-range interactions imply that the energy function $H_n$ is a rugged
invariant, meaning that it is not sensitive to the small-scale structure of the vorticity field.
Indeed, $H_n$ depends only on the local average of $Q$ in a neighborhood of any point, and
therefore it is well approximated by a spatial coarse-graining of $Q$. The same properties are
shared by $C_n$ because it is a linear function of $Q$. By contrast, as stressed in the preceding
subsection, all nonlinear enstrophies $A_n$ are fragile invariants in the sense that they cannot
be approximated by their values on a coarse-grained state.
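
The contrast between rugged and fragile invariants can be illustrated on a one-dimensional toy lattice. The sketch below coarse-grains a random field by block averaging and compares a linear functional (the site average, standing in for the circulation with $b = 0$) with the quadratic enstrophy. The Gaussian field, the block size, and the function name `coarse_grain` are assumptions made for this illustration.

```python
import numpy as np

def coarse_grain(q, block):
    """Replace q by its average over consecutive blocks of `block` sites."""
    means = q.reshape(-1, block).mean(axis=1)
    return np.repeat(means, block)

rng = np.random.default_rng(2)
q = rng.normal(size=1024)            # hypothetical fine-grained field
q_bar = coarse_grain(q, 16)          # block-averaged field on the same lattice

# Linear functional: unchanged by coarse-graining, up to rounding.
circulation_change = abs(q.mean() - q_bar.mean())

# Quadratic enstrophy: strongly reduced, since block averaging removes
# the small-scale fluctuations that carry most of it.
enstrophy_fine = np.mean(q**2)
enstrophy_coarse = np.mean(q_bar**2)
```

By Jensen's inequality the coarse-grained value of any convex enstrophy cannot exceed the fine-grained value, and for a field dominated by small-scale fluctuations the gap is large.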

The canonical parameters $\beta$ and $\gamma$ are scaled by a factor $n$ in (9). This scaling
ensures that, in the continuum limit as $n \to \infty$, the mean values $\langle H_n \rangle$ and
$\langle C_n \rangle$ with respect to this canonical ensemble tend to finite limits, and that the
variances of $H_n$ and $C_n$ around these mean values tend to zero. The canonical ensemble (9)
thus produces equilibrium states having finite total energy and total circulation in the
continuum limit, and hence it is compatible with the microcanonical ensemble (10) in which $E$
and $\Gamma$ are fixed and finite as $n \to \infty$. We note that, while this scaling of the
parameters determining the canonical
ensemble is natural in these local mean-field models, it results in a nonextensive continuum 
limit that is different from the usual thermodynamic limit [[T^J . 

The linear impulse invariant M, which is associated with the translational symmetry 
of the channel domain, can also be included in either the canonical or the microcanonical 
ensembles. For the sake of clarity, however, we ignore it in our development. If M is treated 
canonically, then the energy function H n is simply replaced by H n + UM n , where (U, 0) is 
the velocity of a given uniform zonal flow. Alternatively, if M is imposed microcanonically, 
then U is determined implicitly. In either formulation, the analysis of the impulse constraint 
is the same as that of the circulation constraint, which is also linear in Q. 



3 Maximum entropy principles 

In this section we investigate the continuum limit of the canonical and microcanonical
models constructed in the preceding section, and we thereby derive the maximum entropy
principles which characterize the most probable states for those models. Our analysis
of the continuum limit relies on the powerful methods of the theory of large deviations
[18], [12]. First, we establish a large deviation principle for a certain coarse-graining of the
potential vorticity field $Q$ with respect to the product prior distribution $\Pi_n(dQ)$. With
this basic result in hand, we then analyze the canonical ensemble $P_{n,\beta,\gamma}$ and the
microcanonical ensemble $P_n^{E,\Gamma}$, and establish large deviation principles for the
coarse-grained field with respect to each of these ensembles. In this way we obtain a
variational characterization of the equilibrium macrostates for each model.

3.1 Coarse-grained process. We now introduce a macroscopic description of the potential
vorticity field that complements the microscopic description inherent in the lattice
model. We take the space of macrostates $q$ to be the Hilbert space $L^2(X)$ with the usual
norm $\|q\|^2 = \int_X q^2\,dx$. This natural and convenient choice requires us to impose a
certain decay condition on the prior distribution $\rho$. Specifically, we assume that there
exists $\delta > 0$ such that

    \int_{\mathbb{R}} e^{\delta y^2}\, \rho(dy) < \infty.    (11)

Since this decay condition holds for most prior distributions of interest, including compactly 
supported and Gaussian distributions, we adopt it for the sake of simplicity throughout 
Sections 3, 4 and 5. In Section 6, however, we relax it for a particular prior distribution 
used in the numerical example. 

In order to establish the connection between the microscopic and macroscopic levels
of description, we define a certain coarse-grained process as follows. Partition the domain
$X$ into $\tilde n = 2^{r_1 + r_2}$ macrocells $X_{j_1,j_2}$, with $r_1 \ll m_1$, $r_2 \ll m_2$, and
$j_1 = 1, \dots, 2^{r_1}$, $j_2 = 1, \dots, 2^{r_2}$. This partition represents a coarsening of the
lattice $\mathcal{L}$ that defines the phase space $\mathcal{Q}_n$; each of the $\tilde n$
macrocells $X_{j_1,j_2}$ contains $n/\tilde n$ sites of $\mathcal{L}$. Now, let $Q_{n,\tilde n}$
be the $L^2(X)$-valued stochastic process defined by averaging the random microstate $Q$ over
each macrocell; namely,

    Q_{n,\tilde n}(x) = \frac{\tilde n}{n} \sum_{s \in X_{j_1,j_2}} Q(s)
    \quad \text{for all } x \in X_{j_1,j_2}.    (12)

Clearly, $Q_{n,\tilde n}$ is piecewise constant with respect to the partition of $X$ into
macrocells. The coarse-grained process $Q_{n,\tilde n}$ takes values in the space of macrostates
$L^2(X)$.
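The macrocell average in (12) can be sketched in a few lines of code. The following is a minimal illustration, not the construction used in the paper's computations: it assumes a square lattice with $2^{m_1} \times 2^{m_2}$ sites, a hypothetical two-point prior on $\{-1,+1\}$, and illustrative values of $m_1, m_2, r_1, r_2$. It also compares a linear functional (the site average) with a quadratic one (the mean square) before and after coarse-graining, which mirrors the distinction between rugged and fragile invariants.

```python
import random

def coarse_grain(Q, r1, r2):
    """Average a lattice field over 2^r1 x 2^r2 equal macrocells.

    Returns a field of the same shape that is constant on each macrocell,
    mirroring the piecewise-constant process in (12).
    """
    rows, cols = len(Q), len(Q[0])
    cell_rows, cell_cols = rows >> r1, cols >> r2   # sites per macrocell side
    out = [[0.0] * cols for _ in range(rows)]
    for j1 in range(1 << r1):
        for j2 in range(1 << r2):
            sites = [(a, c)
                     for a in range(j1 * cell_rows, (j1 + 1) * cell_rows)
                     for c in range(j2 * cell_cols, (j2 + 1) * cell_cols)]
            avg = sum(Q[a][c] for a, c in sites) / len(sites)
            for a, c in sites:
                out[a][c] = avg
    return out

def lattice_mean(field, power=1):
    """Site average of field**power (power=1: linear, power=2: quadratic)."""
    vals = [v ** power for row in field for v in row]
    return sum(vals) / len(vals)

random.seed(0)                      # fixed seed, illustration only
m1, m2, r1, r2 = 5, 5, 2, 2         # assumed sizes: 32 x 32 sites, 4 x 4 macrocells
Q = [[random.choice((-1.0, 1.0)) for _ in range(1 << m2)] for _ in range(1 << m1)]
Qc = coarse_grain(Q, r1, r2)

linear_fine, linear_coarse = lattice_mean(Q), lattice_mean(Qc)
square_fine, square_coarse = lattice_mean(Q, 2), lattice_mean(Qc, 2)
```

In this sketch the linear average is unchanged by coarse-graining, while the mean square drops well below its microscopic value of 1, since block averaging removes the small-scale fluctuations that the quadratic functional measures.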

In what follows we shall be interested in a double limit in which both $n \to \infty$ and
$\tilde n \to \infty$, with $n/\tilde n \to \infty$. We refer to this double limit as the
continuum limit. In order to deduce the limiting behavior of $Q_{n,\tilde n}$ under either the
canonical ensemble (9) or the microcanonical ensemble (10), we first establish a basic theorem
that describes its behavior with respect to $\Pi_n(dQ)$. The formulation of this theorem
requires some definitions, which we now state.

Associated with the prior distribution $\rho$ is its cumulant generating function

    f(\eta) = \log \int_{\mathbb{R}} \exp(\eta y)\, \rho(dy) \qquad (\eta \in \mathbb{R}).    (13)

In view of the decay condition (11), $f(\eta)$ is defined and continuous for all
$\eta \in \mathbb{R}$. Moreover, $f$ is a convex function. The convex function $i$ conjugate to
$f$, namely, the Legendre-Fenchel transform of $f$, is defined by

    i(y) = \sup_{\eta \in \mathbb{R}} [\, \eta y - f(\eta) \,] \qquad (y \in \mathbb{R}).    (14)

It is known that $i$ achieves its unique minimum value of 0 over $\mathbb{R}$ at
$y = \bar y = \int y\, \rho(dy)$. The reader is referred to [12], [18] for these definitions and
properties.
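As a concrete check of definitions (13) and (14), the sketch below evaluates both functions for a hypothetical symmetric two-point prior that puts mass 1/2 on each of $y = \pm 1$; this prior is an assumption chosen for illustration and is not taken from the paper. For it, $f(\eta) = \log\cosh\eta$, and the supremum in (14) has the closed form $i(y) = \frac{1+y}{2}\log(1+y) + \frac{1-y}{2}\log(1-y)$ for $|y| < 1$, which the numerical supremum should reproduce.

```python
import math

def f(eta):
    """Cumulant generating function (13) for the two-point prior: log cosh(eta)."""
    a = abs(eta)
    # Overflow-safe form of log((e^eta + e^-eta) / 2).
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)

def rate(y, lo=-30.0, hi=30.0, iters=200):
    """Legendre-Fenchel transform (14), computed by ternary search.

    The map eta -> eta*y - f(eta) is concave, so ternary search finds its
    maximum; the bracket [lo, hi] is adequate for |y| well inside (-1, 1).
    """
    def g(eta):
        return eta * y - f(eta)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            lo = m1
        else:
            hi = m2
    return g(0.5 * (lo + hi))

def rate_closed_form(y):
    """Closed form of i(y) for this prior, valid for |y| < 1."""
    return 0.5 * (1 + y) * math.log(1 + y) + 0.5 * (1 - y) * math.log(1 - y)

samples = [-0.9, -0.5, 0.0, 0.3, 0.8]
numeric = [rate(y) for y in samples]
exact = [rate_closed_form(y) for y in samples]
```

The value at $y = 0$, the mean of this prior, is zero, in agreement with the statement that $i$ attains its minimum value 0 at $\bar y$.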



In terms of these standard constructions, we define the information functional

    I(q) = \int_X i(q(x))\, dx \qquad (q \in L^2(X)).    (15)

In the terminology of large deviation theory, $I$ is a convex rate function; that is, it is a
convex, lower semi-continuous functional mapping $L^2(X)$ into the extended interval
$[0, +\infty]$. In fact, the information functional $I$ is the rate function for the basic large
deviation principle satisfied by the coarse-grained process $Q_{n,\tilde n}$ with respect to the
product prior distribution $\Pi_n(dQ)$. In the following theorem we state a simplified version
of this large deviation principle. In another paper [20], we state and prove the general version.



Theorem 1. For any Borel subset $B$ of $L^2(X)$ that is a continuity set for the rate function
$I$, the following double limit holds:

    \lim_{\tilde n \to \infty} \lim_{n \to \infty} \frac{1}{n} \log \Pi_n\{ Q_{n,\tilde n} \in B \} = -I(B),    (16)

where $I(B) = \inf\{ I(q) : q \in B \}$.

Here, we use the notion of a continuity set $B$ for $I$ to assert simply a double limit
rather than the standard pair of large deviation upper and lower bounds for closed sets $B$
and open sets $B$, respectively. By a continuity set for the information functional (15) we
mean any Borel set $B \subset L^2(X)$ with the property that $I(B^\circ) = I(\bar B)$, where
$B^\circ$ is the interior of $B$ and $\bar B$ is the closure of $B$. Under suitable conditions on
$\rho$, such as (11), the continuity sets of $I$ are rich enough to encompass the sets that arise
in practical applications of the result. The double limit (16) then conveys conceptually the
content of the rigorous large deviation principle given in [20]. The proof relies essentially on
the classical Cramér theorem for sample means of independent and identically distributed random
variables [12], [18]. Roughly speaking, the theorem follows by applying Cramér's theorem to the
local average that defines the coarse-grained process $Q_{n,\tilde n}$ over each macrocell
$X_{j_1,j_2}$, and then by integrating the results for each macrocell over the entire domain $X$.

The asymptotic formula (16) gives an exponential-order correction to the law of large
numbers behavior of the coarse-grained process $Q_{n,\tilde n}$. That is, finite departures of
$Q_{n,\tilde n}$ from its mean value, the constant $\bar y = \int y\, \rho(dy)$, have
exponentially small probability as $n \to \infty$. If we take
$B = \{ q \in L^2(X) : \|q - \bar y\| \ge \delta > 0 \}$ in (16), then we have, for large $n$,
$\tilde n$ and $n/\tilde n$,

    \Pi_n\{ Q_{n,\tilde n} \in B \} \le e^{-n I(B)/2};

in the formula $I(B) > 0$ for any finite $\delta$, while $I(\bar y) = 0$.
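The exponential decay described here can be checked exactly in the simplest setting covered by Cramér's theorem. The sketch below is an illustration under stated assumptions, not part of the paper's argument: it takes $n$ independent spins from a hypothetical fair two-point prior on $\{-1,+1\}$, computes the exact tail probability that their sample mean is at least $a$, and compares $-\frac{1}{n}\log$ of that probability with the rate $i(a)$ for this prior.

```python
import math

def rate(a):
    """Cramer rate function i(a) for a fair +/-1 coin, valid for |a| < 1."""
    return 0.5 * (1 + a) * math.log(1 + a) + 0.5 * (1 - a) * math.log(1 - a)

def tail_exponent(n, a):
    """Exact value of -(1/n) log P(sample mean of n fair +/-1 spins >= a)."""
    k_min = math.ceil(n * (1 + a) / 2)          # number of +1 spins required
    count = sum(math.comb(n, k) for k in range(k_min, n + 1))
    # P = count / 2^n; integer arithmetic avoids floating-point underflow.
    return -(math.log(count) - n * math.log(2)) / n

a = 0.5
exponents = {n: tail_exponent(n, a) for n in (50, 500, 2000)}
limit = rate(a)
```

As $n$ grows, the computed exponents decrease toward $i(a)$ from above; the gap that remains at finite $n$ comes from sub-exponential prefactors, which the large deviation limit does not resolve.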

We may summarize the content of Theorem 1 in the formal asymptotic statement that,
for any macrostate $q \in L^2(X)$,

    \Pi_n\{ Q_{n,\tilde n} \sim q \} \approx e^{-n I(q)} \quad \text{in the continuum limit.}    (17)

Here, the symbol $\sim$ means close in the strong topology of $L^2(X)$. The equivalence
between this formal statement and the precise result (16) can be seen by using balls $B_r(q)$
of arbitrarily small radius $r$ centered at $q$, and the fact that
$I(q) = \lim_{r \to 0} I(B_r(q))$. This asymptotic expression also provides the heuristic
interpretation of the rate functional $I$ as a negative entropy. Indeed, $-I(q)$ quantifies the
multiplicity of the microstates that correspond under the coarse-graining to a macrostate $q$.
Equivalently, $I(q)$ represents the information lost in going from the microscopic to the
macroscopic level of description.

3.2 Canonical model. We now turn to the analysis of the statistical equilibrium model
governed by the canonical ensemble (9). The following theorem characterizes the continuum
limit for that model, using the asymptotics for the coarse-grained process $Q_{n,\tilde n}$.

Theorem 2. With respect to the canonical ensemble $P_{n,\beta,\gamma}(dQ)$, the coarse-grained
process $Q_{n,\tilde n}$ satisfies the double limit

    \lim_{\tilde n \to \infty} \lim_{n \to \infty} \frac{1}{n} \log P_{n,\beta,\gamma}\{ Q_{n,\tilde n} \in B \} = -I_{\beta,\gamma}(B),    (18)

for any Borel subset $B$ of $L^2(X)$ that is a continuity set for $I_{\beta,\gamma}$; in this
formula,

    I_{\beta,\gamma}(q) = I(q) + \beta H(q) + \gamma C(q) - \Phi(\beta,\gamma),    (19)

where

    \Phi(\beta,\gamma) = \min_{q \in L^2(X)} [\, I(q) + \beta H(q) + \gamma C(q) \,]    (20)
                       = -\lim_{n \to \infty} \frac{1}{n} \log Z_n(\beta,\gamma).



The proof of this theorem is indicated in our companion paper [20]. The key idea is
to represent the interaction functions $H_n$ and $C_n$ in the Gibbs measure (9) in terms of
the coarse-grained process and the corresponding continuum functionals $H$ and $C$. This
representation is provided by the following approximations

    H_n(Q) = H(Q_{n,\tilde n}) + o(1), \qquad C_n(Q) = C(Q_{n,\tilde n}) + o(1),    (21)

in which the o(l) errors are uniformly small over Q G Q n . Here, and henceforth, we 
evaluate the functionals H and C defined in (||) and (||) on macrostates q G L 2 (X). The 
streamfunction $\psi$ corresponding to any such $q$ is the solution to
$-\Delta\psi + r^{-2}\psi + b = q$ in $X$ with appropriate boundary conditions on $\partial X$;
that is, $\psi = G(q - b)$, where $G$ denotes the Green operator for $-\Delta + r^{-2}$:

    (Gz)(x) = \int_X g(x,x')\, z(x')\, dx' \qquad (z \in L^2(X)).    (22)

The approximations (21) express the fundamental fact that the global invariants $H$ and
$C$ are not sensitive to the small-scale fluctuations of the microstate $Q$, being almost
unchanged by the local averaging that defines the coarse-grained process. The quadratic
self-interaction term in $H$ has this property because it is defined by the long-range
interaction function $g(x,x')$. $C$ and the term in $H$ arising from interaction with the
topography have this property because they are linear (affine). With the representations in
hand, the large deviation limit (18) and the limit in (20) can be established by general methods,
namely, the Laplace method for the asymptotics of large deviation-type expectations. As
the proofs are very similar to those already given in 0, we omit them here. 

From the point of view of predicting the coherent states in a turbulent fluid, the
essential content of the large deviation principle for the canonical ensemble lies in the
canonical information functional $I_{\beta,\gamma}$. According to (18), the most probable
macrostates $q$ are those at which $I_{\beta,\gamma}(q)$ achieves its minimum value of 0. For
this reason, we define the set of equilibrium states associated with given canonical parameters
$\beta, \gamma \in \mathbb{R}$ to be

    \mathcal{E}_{\beta,\gamma} = \{ q \in L^2(X) : I_{\beta,\gamma}(q) = 0 \} = \arg\min\, [\, I + \beta H + \gamma C \,].    (23)

Any macrostate $q$ that does not lie in $\mathcal{E}_{\beta,\gamma}$ has an exponentially small
probability of being observed as a coarse-grained state in the continuum limit; indeed, for such
a macrostate $I_{\beta,\gamma}(q) = \delta$ for some positive $\delta$, and therefore the large
deviation principle implies that for large $n$ and $\tilde n$

    P_{n,\beta,\gamma}\{ Q_{n,\tilde n} \sim q \} \le e^{-n\delta/2}.

In light of this sharp estimate, we see that the equilibrium macrostates in
$\mathcal{E}_{\beta,\gamma}$ are overwhelmingly most probable among all possible coarse-grained
states of the turbulent system. Consequently, the main predictions of the statistical
equilibrium theory in its canonical form are derived by solving the unconstrained minimization
problem whose objective functional is $I + \beta H + \gamma C$. The existence of at least one
equilibrium state $q$ in $\mathcal{E}_{\beta,\gamma}$ for each given
$\beta, \gamma \in \mathbb{R}$ can be deduced readily by the direct methods of the calculus of
variations. In general, $\mathcal{E}_{\beta,\gamma}$ may contain more than one macrostate, in
which case the statistical equilibrium model exhibits a phase transition.

Let us now display the first-order conditions for the variational problem whose solutions
are the equilibrium states in the canonical model. At a given solution
$q \in \mathcal{E}_{\beta,\gamma}$, there holds

    0 = \delta(I + \beta H + \gamma C)(q)    (24)
      = \int_X [\, i'(q) + \beta\psi + \gamma \,]\, \delta q\, dx,

where $\psi$ is the streamfunction corresponding to $q$, and $\delta q$ denotes a variation in
$L^2(X)$. From this calculation we obtain the equilibrium equation
$i'(q) = -\beta\psi - \gamma$, which we can express in the form

    q = -\Delta\psi + r^{-2}\psi + b = f'(-\beta\psi - \gamma).    (25)

The last expression uses the fact that, since $f$ and $i$ are conjugate convex functions, their
first derivatives $f'$ and $i'$ are inverse functions. Thus, the statistical equilibrium model
produces a semilinear elliptic equation for the streamfunction $\psi$ of the most probable flow.
We shall refer to (25) as the mean-field equation. The predicted dependence $f'$ of the mean
potential vorticity on the mean streamfunction is determined solely by the statistical
properties of the small-scale fluctuations in the model, since the prior distribution $\rho$
determines $f$ through (13). With a fixed prior distribution $\rho$, the branches of most
probable, or coherent, states are parametrized by $\beta$ and $\gamma$, which enter nonlinearly
in (25). The mean-field equation can possess nonunique solutions, and its solution branches can
bifurcate.
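To make the mean-field equation (25) concrete, the following sketch solves a one-dimensional analogue by fixed-point iteration. Everything specific in it is an assumption made for illustration: the domain is the unit interval with $\psi = 0$ at both ends, the prior is the symmetric two-point distribution on $\{-1,+1\}$ (so that $f' = \tanh$), and the topography `b` and the parameter values are hypothetical. The iteration $\psi \mapsto G(\tanh(-\beta\psi - \gamma) - b)$ contracts here because $\beta$ is positive and small relative to the smallest eigenvalue of the discretized operator; no such guarantee is claimed for negative $\beta$.

```python
import math

def solve_tridiagonal(diag, off, rhs):
    """Thomas algorithm for a symmetric tridiagonal system with constant bands."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for k in range(1, n):
        denom = diag - off * c[k - 1]
        c[k] = off / denom
        d[k] = (rhs[k] - off * d[k - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for k in range(n - 2, -1, -1):
        x[k] = d[k] - c[k] * x[k + 1]
    return x

def mean_field_1d(beta, gamma, b, r=1.0, sweeps=80):
    """Iterate psi <- G(tanh(-beta*psi - gamma) - b) on a uniform grid."""
    n = len(b)
    h = 1.0 / (n + 1)
    diag, off = 2.0 / h ** 2 + r ** -2, -1.0 / h ** 2   # -d2/dx2 + r^-2
    psi = [0.0] * n
    for _ in range(sweeps):
        q = [math.tanh(-beta * p - gamma) for p in psi]
        psi = solve_tridiagonal(diag, off, [qk - bk for qk, bk in zip(q, b)])
    return psi

def residual(psi, beta, gamma, b, r=1.0):
    """Max-norm residual of -psi'' + psi/r^2 + b = tanh(-beta*psi - gamma)."""
    n = len(psi)
    h = 1.0 / (n + 1)
    worst = 0.0
    for k in range(n):
        left = psi[k - 1] if k > 0 else 0.0
        right = psi[k + 1] if k < n - 1 else 0.0
        lhs = (2.0 * psi[k] - left - right) / h ** 2 + psi[k] / r ** 2 + b[k]
        worst = max(worst, abs(lhs - math.tanh(-beta * psi[k] - gamma)))
    return worst

N = 63
x = [(k + 1) / (N + 1) for k in range(N)]
b = [0.5 * math.sin(2.0 * math.pi * xk) for xk in x]   # hypothetical topography
beta, gamma = 2.0, 0.3                                   # illustrative parameters
psi = mean_field_1d(beta, gamma, b)
q = [math.tanh(-beta * p - gamma) for p in psi]
max_residual = residual(psi, beta, gamma, b)
```

The computed `q` stays strictly inside $(-1, 1)$, as it must for a mean of a prior supported on $\{-1,+1\}$, and `max_residual` measures how well the discrete analogue of (25) is satisfied.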

Let us also record the second-order conditions at an equilibrium state $q$. With $\delta\psi$
denoting the solution to $(-\Delta + r^{-2})\, \delta\psi = \delta q$ under appropriate boundary
conditions, there holds

    0 \le \delta^2(I + \beta H + \gamma C)(q)    (26)
      = \int_X \left[ i''(q)(\delta q)^2 + \beta \left( |\nabla \delta\psi|^2 + r^{-2}(\delta\psi)^2 \right) \right] dx.



This condition is equivalent to the nonnegative-definiteness of the bounded, symmetric
operator $i''(q) + \beta G$ on $L^2(X)$, where $i''(q)$ is a multiplication operator and $G$ is
the Green operator (22). Both of these component operators are positive-definite. Consequently,
the second-order conditions are automatically satisfied whenever $\beta$ is positive. When
$\beta$ is negative, however, a critical point $q$ satisfying the mean-field equation is not an
equilibrium state unless the second variation of $I + \beta H$ is nonnegative-definite at $q$.
Accordingly, the second-order conditions are crucial in the negative temperature regime, which
is often the regime of most interest in the study of isolated coherent structures. Finally, we
note that if the second variation is strictly positive-definite at $q$, then
$\mathcal{E}_{\beta,\gamma} = \{q\}$, and the equilibrium is isolated and nondegenerate.
Conversely, the degeneracy of the second variation signals the presence of a phase transition.
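A worked special case may help fix ideas about the role of the sign of $\beta$. The block below assumes a standard Gaussian prior, which is an illustrative choice that satisfies the decay condition but is not singled out in the text; the eigenvalues $\lambda_k$ of $-\Delta + r^{-2}$ are introduced here only for the computation.

```latex
% Worked example: standard Gaussian prior (an illustrative assumption).
\rho(dy) = (2\pi)^{-1/2} e^{-y^2/2}\, dy
\quad\Longrightarrow\quad
f(\eta) = \tfrac{1}{2}\eta^2, \qquad i(y) = \tfrac{1}{2}y^2, \qquad i''(q) \equiv 1 .

% The mean-field equation (25) becomes linear in \psi:
-\Delta\psi + r^{-2}\psi + b = -\beta\psi - \gamma .

% Let 0 < \lambda_1 \le \lambda_2 \le \cdots be the eigenvalues of
% -\Delta + r^{-2} under the given boundary conditions, so that G has
% eigenvalues 1/\lambda_k. The operator in the second-order condition (26) is
i''(q) + \beta G = 1 + \beta G
\quad\text{with eigenvalues}\quad 1 + \beta/\lambda_k ,

% and therefore
\delta^2 (I + \beta H + \gamma C)(q) \ge 0 \ \text{for all } \delta q
\iff \beta \ge -\lambda_1 .
```

In this special case the second-order condition holds for every $\beta > 0$ and fails once $\beta$ drops below $-\lambda_1$, which is one simple instance of the negative temperature restriction described above.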

3.3 Microcanonical model. In some respects the microcanonical ensemble (10) defines
a more natural model than the corresponding canonical ensemble. From a physical point of
view, the canonical parametrization of equilibrium states by an inverse temperature $\beta$ and
a chemical potential $\gamma$ is undesirable because the coherent structures are not maintained
by contact with a bath having these parameters. Rather, the equilibrium states represent
organized flows on the large scales which contain the energy $E$ and circulation $\Gamma$ and
are isolated from the turbulent fluctuations on the small scales. It is therefore reasonable to
parametrize such flows by $E$ and $\Gamma$. From a mathematical standpoint, we are compelled
to study the microcanonical formulation of the statistical equilibrium theory by virtue of
the fact, which we establish in Section 4, that the microcanonical model is not always
equivalent to the canonical model in the continuum limit.

The following theorem characterizes the continuum limit for the microcanonical model
in terms of the coarse-grained process $Q_{n,\tilde n}$.

Theorem 3. With respect to the microcanonical ensemble $P_n^{E,\Gamma}(dQ)$, the coarse-grained
process $Q_{n,\tilde n}$ satisfies the double limit

    \lim_{\tilde n \to \infty} \lim_{n \to \infty} \frac{1}{n} \log P_n^{E,\Gamma}\{ Q_{n,\tilde n} \in B \} = -I^{E,\Gamma}(B),    (27)

for any Borel subset $B$ of $L^2(X)$ that is a continuity set for $I^{E,\Gamma}$; in this formula,

    I^{E,\Gamma}(q) = \begin{cases} I(q) + S(E,\Gamma) & \text{if } H(q) = E,\ C(q) = \Gamma, \\ +\infty & \text{otherwise,} \end{cases}    (28)

where

    S(E,\Gamma) = -\min\{ I(q) : H(q) = E,\ C(q) = \Gamma \}    (29)
                = \lim_{n \to \infty} \frac{1}{n} \log \Pi_n\{ H_n = E,\ C_n = \Gamma \}.



We reiterate our earlier remark that we have taken the microcanonical constraints to
be exact equalities for the sake of clarity in the exposition. To obtain mathematically
rigorous versions of these results, we first replace the microcanonical constraints by the
containments $H_n \in [E - \epsilon, E + \epsilon]$ and $C_n \in [\Gamma - \delta, \Gamma + \delta]$
with finite $\epsilon, \delta > 0$, and we then take a third limit as $\epsilon, \delta \to 0$
in (27) after the limits on $n$ and $\tilde n$.

This theorem is a simplified version of a general theorem that we formulate and prove in
our companion paper [20]. As in the analysis of the canonical model, the representations (21)
are fundamental to the proof. With these approximations and the basic large deviation
principle (16) in hand, the large deviation principle (27) can be deduced directly from the
general results in [20].

The large deviation principle for the microcanonical ensemble involves the microcanonical
information functional $I^{E,\Gamma}$. Among the macrostates lying on the microcanonical
manifold $H = E$, $C = \Gamma$, the most probable macrostates $q$ are those at which
$I^{E,\Gamma}$ achieves its minimum value of 0. These macrostates compose the set of equilibrium
states associated with given microcanonical parameters $E > 0$, $\Gamma \in \mathbb{R}$; namely,

    \mathcal{E}^{E,\Gamma} = \{ q \in L^2(X) : I^{E,\Gamma}(q) = 0 \} = \arg\min\{ I : H = E,\ C = \Gamma \}.    (30)

As in the canonical model, any macrostate $q$ that does not lie in the equilibrium set has
an exponentially small probability of being observed as a coarse-grained state in the
continuum limit. Conversely, the equilibrium macrostates in $\mathcal{E}^{E,\Gamma}$, which solve
the constrained minimization problem with objective functional $I$, are the overwhelmingly most
probable coarse-grained states compatible with the microcanonical constraints $H = E$,
$C = \Gamma$. Again, as in the canonical model, the existence of an equilibrium state $q$ in
$\mathcal{E}^{E,\Gamma}$ is ensured by direct methods. Since the equality constraint $H = E$
makes the microcanonical manifold a nonconvex set, constrained minimizers may be nonunique, and
hence $\mathcal{E}^{E,\Gamma}$ may contain multiple equilibrium macrostates.

The first-order conditions for a microcanonical equilibrium $q \in \mathcal{E}^{E,\Gamma}$ are
identical to (24), except that $\beta$ and $\gamma$ are Lagrange multipliers for the energy and
circulation constraints, respectively. The solution triple $(q, \beta, \gamma)$ is determined,
in principle, by the given constraint pair $(E, \Gamma)$, since the multipliers are uniquely
determined by the critical point $q$. Similarly, the mean-field equation (25) holds without
change in the microcanonical model, except that the parameters $\beta$ and $\gamma$ appearing
in it are also unknowns.

The second-order conditions, on the other hand, are fundamentally altered by shifting
from the canonical to the microcanonical formulation. From general principles in optimization,
we know that the nonnegativity condition (26) at a constrained minimizer $q$ holds for all
variations $\delta q$ that are infinitesimally compatible with the constraints, but not
necessarily for arbitrary variations $\delta q$ [27], [51]. Thus, we find that the second-order
conditions appropriate to a macrostate $q \in \mathcal{E}^{E,\Gamma}$ are that (26) holds for
all $\delta q$ satisfying the linearized side-conditions

    \delta H(q) = \int_X \psi\, \delta q\, dx = 0 \quad \text{and} \quad \delta C(q) = \int_X \delta q\, dx = 0.    (31)

Given this characterization of the constrained minimizers of $I$ subject to $H = E$ and
$C = \Gamma$, we see that the set of microcanonical equilibria is potentially larger than the
corresponding set of canonical equilibria.

This difference between the canonical and microcanonical equilibrium equations at 
second-order underlies all of our subsequent development. Broadly speaking, it implies 
that families of microcanonical equilibria are richer than corresponding families of canoni- 
cal equilibria, and that nonlinear stability criteria based on the microcanonical formulation 
are finer than corresponding criteria for the canonical formulation. 
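The gap between the two kinds of second-order conditions already appears in two dimensions. The sketch below uses a hypothetical toy problem that is not drawn from the paper: minimize $I(x,y) = x^2 + y^2$ subject to $H(x,y) = y^3 = E$. The constraint forces $y = E^{1/3}$, so the constrained minimizer is $(0, E^{1/3})$, and the multiplier $\beta$ follows from stationarity of $I + \beta H$. The code evaluates the Hessian of $I + \beta H$ along the direction tangent to the constraint and along the normal direction.

```python
def lagrangian_hessian(y, beta):
    """Hessian of L = I + beta*H for I = x^2 + y^2 and H = y^3 (toy problem)."""
    return ((2.0, 0.0), (0.0, 2.0 + 6.0 * beta * y))

def quad_form(hess, v):
    """Second variation v^T (Hessian) v in the direction v."""
    return sum(v[i] * hess[i][j] * v[j] for i in range(2) for j in range(2))

E = 1.0
y_star = E ** (1.0 / 3.0)              # the constraint y^3 = E forces y = E^(1/3)
beta = -2.0 / (3.0 * y_star)           # from dL/dy = 2y + 3*beta*y^2 = 0
hess = lagrangian_hessian(y_star, beta)

grad_H = (0.0, 3.0 * y_star ** 2)      # gradient of the constraint at the minimizer
tangent = (1.0, 0.0)                   # satisfies grad_H . v = 0
normal = (0.0, 1.0)                    # violates the linearized constraint

on_tangent = quad_form(hess, tangent)  # second variation on compatible variations
on_normal = quad_form(hess, normal)    # second variation on an arbitrary variation
```

Here the second variation is positive on the tangent direction and negative on the normal direction, so the point is a constrained minimizer without being a minimizer of $I + \beta H$. In this toy problem the analogue of the entropy is $S(E) = -E^{2/3}$, which is not concave for $E > 0$.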



4 Equivalence and nonequivalence 

We now turn our attention to the relation between the equilibrium sets
$\mathcal{E}_{\beta,\gamma}$ for the canonical model and the equilibrium sets
$\mathcal{E}^{E,\Gamma}$ for the microcanonical model. In most statistical equilibrium models,
the canonical and microcanonical ensembles are equivalent, in the sense that there is a
one-to-one correspondence between their equilibrium states. For the local mean-field models of
coherent structures in turbulence, however, there can be microcanonical equilibria that cannot
be realized as canonical equilibria. Moreover, these equilibrium states are neither rare nor
pathological. Rather, they are often the coherent mean flows of greatest physical interest. In
the analysis to follow, we show how the properties of the thermodynamic functions in the
microcanonical and canonical models determine the correspondence, or lack of correspondence,
between equilibria for these two models.

4.1 Thermodynamic functions. The fundamental thermodynamic function for the microcanonical
model is the value function $S(E,\Gamma)$ in the constrained maximum entropy principle (29)
whose solutions constitute the equilibrium set $\mathcal{E}^{E,\Gamma}$. Similarly, the
fundamental thermodynamic function for the canonical model is the value function
$\Phi(\beta,\gamma)$ in the free maximum entropy principle (20) whose solutions constitute the
equilibrium set $\mathcal{E}_{\beta,\gamma}$. These two functions are conjugate functions in the
sense of convex analysis [27], [51]; that is, they are related by the identity

    \Phi(\beta,\gamma) = \inf_{(E,\Gamma)} [\, \beta E + \gamma\Gamma - S(E,\Gamma) \,].    (32)

The proof simply amounts to writing the free minimization in (20) in terms of the constrained
minimization in (29):

    \min_q [\, I + \beta H + \gamma C \,] = \inf_{E,\Gamma}\, \min_q \{ I + \beta E + \gamma\Gamma : H = E,\ C = \Gamma \}
                                          = \inf_{E,\Gamma} [\, \beta E + \gamma\Gamma - S(E,\Gamma) \,].

In other words, $\Phi = S^*$ is the Legendre-Fenchel transform of $S$. Consequently, $\Phi$ is a
concave function of $(\beta,\gamma)$, which runs over $\mathbb{R}^2$. By contrast, $S$ itself is
not necessarily concave. The concave hull of $S$ is furnished by the conjugate function of
$\Phi$, namely, $\Phi^* = S^{**}$, which satisfies the inequality

    S(E,\Gamma) \le \inf_{(\beta,\gamma)} [\, \beta E + \gamma\Gamma - \Phi(\beta,\gamma) \,] = S^{**}(E,\Gamma).    (33)

The relation between microcanonical equilibria and canonical equilibria depends crucially
on the concavity properties of the microcanonical entropy $S$. Henceforth, we shall
consider the function $S$ to be defined on a domain $\mathcal{A}$, which we take to be the
largest open subset of $\mathbb{R}^2$ consisting of admissible constraint pairs $(E,\Gamma)$ for
the microcanonical model; such a constraint pair is admissible if
$(E,\Gamma) = (H(q), C(q))$ for some $q \in L^2(X)$ with $I(q) < +\infty$. We call this domain
$\mathcal{A}$ the admissible set for the microcanonical model. Since, in general, $S$ is not a
concave function on $\mathcal{A}$, we introduce the subset $\mathcal{C} \subseteq \mathcal{A}$
on which the concave hull $S^{**}$ coincides with $S$; that is, $(E,\Gamma) \in \mathcal{C}$ if
and only if $S^{**}(E,\Gamma) = S(E,\Gamma)$. There is another equivalent definition of
$\mathcal{C}$. Namely, $\mathcal{C}$ consists of those points $(E,\Gamma) \in \mathcal{A}$ for
which there exists some $(\beta,\gamma) \in \mathbb{R}^2$ such that

    S(E',\Gamma') \le S(E,\Gamma) + \beta(E' - E) + \gamma(\Gamma' - \Gamma)    (34)

for all $(E',\Gamma') \in \mathcal{A}$. This condition means that $S$ has a supporting plane,
with normal determined by $(\beta,\gamma)$, at the point $(E,\Gamma)$. Such points $(E,\Gamma)$
are precisely those points of $\mathcal{A}$ at which $S$ has a (nonempty) superdifferential,
which is the set of all $(\beta,\gamma)$ for which the above condition holds [27], [51].
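The conjugacy relations (32) and (33) can be explored numerically in a one-parameter analogue. The sketch below is an illustration only: the function `S` is a hypothetical nonconcave entropy of a single variable $E$ on the interval $[-2, 2]$, and the grids for $E$ and $\beta$ are arbitrary choices. It computes $\Phi$ as a discrete infimum, then the concave hull $S^{**}$, and compares the hull with $S$ at one point inside the nonconcave region and one point outside it.

```python
def S(E):
    """Hypothetical nonconcave 'entropy' with maxima at E = -1 and E = +1."""
    return -(E * E - 1.0) ** 2

# Grids built from integers so that values such as 1.5, 0.0 and -7.5 are exact.
E_grid = [i / 100.0 for i in range(-200, 201)]       # stand-in for the set A
beta_grid = [j / 10.0 for j in range(-300, 301)]     # slopes beta in [-30, 30]

# Phi(beta) = inf_E [beta*E - S(E)], the one-variable analogue of (32).
Phi = {beta: min(beta * E - S(E) for E in E_grid) for beta in beta_grid}

def S_hull(E):
    """Concave hull S**(E) = inf_beta [beta*E - Phi(beta)], the analogue of (33)."""
    return min(beta * E - Phi[beta] for beta in beta_grid)

inside_gap = S_hull(0.0) - S(0.0)     # E = 0 lies between the two maxima
outside_gap = S_hull(1.5) - S(1.5)    # E = 1.5 lies where S is concave
```

At $E = 0$ the hull lies strictly above $S$, so this point is outside the analogue of the concavity set; at $E = 1.5$ the two agree, since the tangent line with slope $\beta = S'(1.5) = -7.5$ supports $S$ from above.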

4.2 Microcanonical and canonical equilibrium sets. The set C, which we call the 
concavity set, plays a pivotal role in the criteria for equivalence of ensembles. The following 
theorem gives results of this kind. 

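Theorem 4.

(a) If $(E,\Gamma) \in \mathcal{A}$ belongs to $\mathcal{C}$, then
$\mathcal{E}^{E,\Gamma} \subseteq \mathcal{E}_{\beta,\gamma}$ for some $(\beta,\gamma)$.

(b) If $(E,\Gamma) \in \mathcal{A}$ does not belong to $\mathcal{C}$, then
$\mathcal{E}^{E,\Gamma} \cap \mathcal{E}_{\beta,\gamma} = \emptyset$ for all $(\beta,\gamma)$.

Proof. (a) The hypothesis means that equality holds in (33) and is attained at some
$(\beta,\gamma)$ for which

    S(E,\Gamma) = \beta E + \gamma\Gamma - \Phi(\beta,\gamma).

To show the claimed containment, take any $q \in \mathcal{E}^{E,\Gamma}$ and note that
$H(q) = E$, $C(q) = \Gamma$ and $I(q) = -S(E,\Gamma)$. Substitution of these expressions into
the above equality yields

    I(q) + \beta H(q) + \gamma C(q) = -S(E,\Gamma) + \beta E + \gamma\Gamma
                                    = \Phi(\beta,\gamma) = \min_q [\, I(q) + \beta H(q) + \gamma C(q) \,],

using (20). Since $\mathcal{E}_{\beta,\gamma}$ consists of the minimizers of
$I + \beta H + \gamma C$, it follows that $q \in \mathcal{E}_{\beta,\gamma}$. This completes the
proof of (a).

(b) A complementary argument to that used in (a) applies. Now, the hypothesis means
that, for all $(\beta,\gamma)$,

    S(E,\Gamma) < \beta E + \gamma\Gamma - \Phi(\beta,\gamma).

Then, any $q \in \mathcal{E}^{E,\Gamma}$ satisfies

    I(q) + \beta H(q) + \gamma C(q) = -S(E,\Gamma) + \beta E + \gamma\Gamma
                                    > \Phi(\beta,\gamma) = \min_q [\, I(q) + \beta H(q) + \gamma C(q) \,].

Thus, $q$ does not minimize $I + \beta H + \gamma C$, and hence does not belong to
$\mathcal{E}_{\beta,\gamma}$. Since $(\beta,\gamma)$ is arbitrary, this completes the proof of (b).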

From Theorem 4 we see that, for constraint pairs in $\mathcal{C}$, the microcanonical
equilibria are contained in a corresponding canonical equilibrium set, while, for constraint
pairs in $\mathcal{A} \setminus \mathcal{C}$, the microcanonical equilibria are not contained in
any canonical equilibrium set. Consequently, whenever $\mathcal{C} \ne \mathcal{A}$ the
canonical equilibria do not exhaust the admissible microcanonical constraint pairs, and there
are microcanonical equilibria that are not realized by any canonical equilibria. On the other
hand, all canonical equilibria are contained in some microcanonical equilibrium set, and
$\mathcal{C}$ is exhausted by the constraint pairs realized by all canonical equilibria. These
further results are given in the following theorem.

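Theorem 5.

(a) The concavity set $\mathcal{C}$ consists of all constraint pairs realized by the canonical
equilibria; that is,

    \mathcal{C} = \bigcup \{ (H,C)(\mathcal{E}_{\beta,\gamma}) : (\beta,\gamma) \in \mathbb{R}^2 \}.    (35)

(b) Each canonical equilibrium set $\mathcal{E}_{\beta,\gamma}$ consists of all microcanonical
equilibria whose constraint pairs are realized by $\mathcal{E}_{\beta,\gamma}$; that is, for any
$(\beta,\gamma)$,

    \mathcal{E}_{\beta,\gamma} = \bigcup \{ \mathcal{E}^{E,\Gamma} : (E,\Gamma) \in (H,C)(\mathcal{E}_{\beta,\gamma}) \}.    (36)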

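Proof. (a) The containment of $\mathcal{C}$ in the union is immediate from Theorem 4a. To show
the opposite containment we argue by contradiction, supposing that for some $(\beta,\gamma)$ and
some $q \in \mathcal{E}_{\beta,\gamma}$,
$(E,\Gamma) = (H(q), C(q)) \in \mathcal{A} \setminus \mathcal{C}$. Then, we find that

    S(E,\Gamma) < \beta E + \gamma\Gamma - \Phi(\beta,\gamma)
                = -I(q) \le S(E,\Gamma),

using (20) and (29) as in the proof of Theorem 4b. We thus obtain the desired contradiction.
This completes the proof of (a).

(b) The containment of $\mathcal{E}_{\beta,\gamma}$ in the union is straightforward. Let
$\bar q \in \mathcal{E}_{\beta,\gamma}$ and set $E = H(\bar q)$ and $\Gamma = C(\bar q)$. Then,
$I(\bar q) + \beta E + \gamma\Gamma \le I(q) + \beta H(q) + \gamma C(q)$ for all $q$.
For those $q$ which satisfy the constraints $H(q) = E$, $C(q) = \Gamma$, we therefore find that
$I(\bar q) \le I(q)$. Hence, $\bar q \in \mathcal{E}^{E,\Gamma}$.

The opposite containment is also straightforward. If $E = H(\bar q)$ and $\Gamma = C(\bar q)$
for some $\bar q \in \mathcal{E}_{\beta,\gamma}$, then for any $q \in \mathcal{E}^{E,\Gamma}$, we
have $I(q) \le I(\bar q)$. Since $\bar q \in \mathcal{E}_{\beta,\gamma}$, we obtain

    \min_{q'} [\, I(q') + \beta H(q') + \gamma C(q') \,] = I(\bar q) + \beta E + \gamma\Gamma
                                                       \ge I(q) + \beta H(q) + \gamma C(q).

Hence, $q \in \mathcal{E}_{\beta,\gamma}$. This completes the proof of (b).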

Theorems 4 and 5 allow us to classify the microcanonical constraint parameters $(E,\Gamma)$
according to whether or not equivalence of ensembles holds for those parameters. In fact,
the admissible set $\mathcal{A}$ can be decomposed into three disjoint sets, where (1) there is
a one-to-one correspondence between microcanonical and canonical equilibria, (2) there is
a many-to-one correspondence from microcanonical equilibria to canonical equilibria, and
(3) there is no correspondence. In order to simplify the precise statement of this result,
let us assume that the microcanonical entropy $S(E,\Gamma)$ is differentiable on its domain
$\mathcal{A}$. Then, for each microcanonical parameter $(E,\Gamma) \in \mathcal{A}$ there is a
corresponding canonical parameter $(\beta,\gamma)$ determined locally by

    \beta = \frac{\partial S}{\partial E}, \qquad \gamma = \frac{\partial S}{\partial \Gamma}.

Under this assumption, we have the following classification.

1. Full equivalence. If $(E,\Gamma)$ belongs to $\mathcal{C}$ and there is a unique point of
contact between $S$ and its supporting plane at $(E,\Gamma)$, then $\mathcal{E}^{E,\Gamma}$
coincides with $\mathcal{E}_{\beta,\gamma}$.

2. Partial equivalence. If $(E,\Gamma)$ belongs to $\mathcal{C}$ but there is more than one
point of contact between $S$ and its supporting plane at $(E,\Gamma)$, then
$\mathcal{E}^{E,\Gamma}$ is a strict subset of $\mathcal{E}_{\beta,\gamma}$. Moreover,
$\mathcal{E}_{\beta,\gamma}$ contains all those $\mathcal{E}^{E',\Gamma'}$ for which
$(E',\Gamma')$ is also a point of contact.

3. Nonequivalence. If $(E,\Gamma)$ does not belong to $\mathcal{C}$, then
$\mathcal{E}^{E,\Gamma}$ is disjoint from $\mathcal{E}_{\beta,\gamma}$. In fact,
$\mathcal{E}^{E,\Gamma}$ is disjoint from all canonical equilibrium sets.

The proofs of these results can be constructed easily using the same techniques as in the
proofs of Theorems 4 and 5. We therefore leave the necessary demonstrations to the reader.
We give a complete discussion of these results in a more general setting in our paper [20],
where we state and prove the corresponding results without the simplifying assumption that
$S$ is differentiable. Experience with numerical solutions of variational problems of this
kind, however, strongly suggests that the differentiability assumption is essentially always
satisfied. These computations also show that the parameter regime of nonequivalence can
be quite wide and can contain many physically interesting equilibrium flows. In Section 6,
we present a computed example that illustrates this behavior.



5 Nonlinear stability 

In either the canonical or the microcanonical model, the equilibrium macrostates determine 
steady mean flows that are the most probable flows compatible with the given parameters 
of the model. This statistical property of the mean flows can be interpreted as a stabil- 
ity property in a weak sense. That is, while the underlying ergodic dynamics continually 
produces unsteady perturbations in the microstate, the coarse-grained macrostate remains 
near the mean flow with very high probability. In other words, the construction of the 
steady mean flows as statistical equilibrium macrostates guarantees that they are stable 
with respect to perturbations on the microscopic scales. We now inquire whether these 
steady mean states are also stable in a strong sense with respect to macroscopic
perturbations. Precisely, we investigate the evolution under ideal dynamics of any perturbed
macroscopic state $q(t)$ that initially lies within a small, finite distance
$\|q(0) - \bar q\|$ in $L^2(X)$ of an equilibrium macrostate $\bar q$.

In the canonical model, we find that the most probable state $\bar q$ for any $\beta$ and
$\gamma$ satisfies the celebrated Arnold stability criteria, the canonical information
functional $I_{\beta,\gamma}$ being the required Lyapunov functional. We collect these results
in Subsection 5.1. In the microcanonical model, on the other hand, we encounter a gap in the
classical stability criteria in the sense that there are microcanonical equilibria which are
stable, but for which $I_{\beta,\gamma}$ does not satisfy the conditions needed in the Lyapunov
stability argument. In Subsection 5.2, we therefore devise a more refined argument based on a
penalization of this functional and thereby fill the gap in the known stability theorems.

5.1 Arnold stability theorems. The equilibria for the canonical model correspond to
steady flows that satisfy the nonlinear stability criteria of Arnold [1], [33]. In this
subsection we reformulate these classical results in the context of the statistical equilibrium
theory.

Throughout this discussion we assume that for given values of the canonical parameters 
(3 and 7, the equilibrium state q G £g )7 is an isolated, nondegenerate minimizer of canonical 
information functional Ig )7 ; otherwise, the stability of a single equilibrium state q cannot 
be expected. The fact that q is a minimizer of Ip n over L 2 (X) guarantees that the second 
variation £ 2 /g i7 (g) appearing in (^Bj) is nonnegative definite. A sufficient condition for q to 
be a nondegenerate minimizer is that 8 2 Ip yl {q) be strictly positive definite. More precisely, 
we say that q is an nondegenerate canonical equilibrium state if 



$$\delta^2 I_{\beta,\gamma}(\bar q) \ge \mu \|\delta q\|^2 \tag{37}$$

for all variations $\delta q \in L^2(X)$, with a positive constant $\mu$ independent of $\delta q$. The optimal
constant $\mu$ in (37) is the smallest eigenvalue of the operator $i''(\bar q) + \beta G$, where $G$ is the
Green operator (22). This fact is immediate from the identity

$$\delta^2 I_{\beta,\gamma}(\bar q) = \delta^2 (I + \beta H + \gamma C)(\bar q) = \int_X \left[ i''(\bar q)(\delta q)^2 + \beta\, \delta q\, G\, \delta q \right] dx .$$

An upper bound that complements the lower bound (37) also holds, namely,

$$\delta^2 I_{\beta,\gamma}(\bar q) \le \nu \|\delta q\|^2 . \tag{38}$$

In this upper bound it suffices to take the constant $\nu = \max i''(\bar q) + |\beta|/\lambda_1$, where $\lambda_1 > 0$ is
the smallest eigenvalue of $-\Delta + r^{-2}$; the required bound on $i''(\bar q) = 1/f''(-\beta\bar\psi - \gamma)$ follows
easily from the fact that $f''(\eta)$ equals the variance of the distribution $Z(\eta)^{-1} e^{\eta y} \rho(dy)$, which
is bounded below by a positive constant uniformly for $\eta$ in a bounded interval.

The nonlinear stability result for the canonical model is summarized in the following
theorem.

Theorem 6. If $\bar q \in \mathcal{E}_{\beta,\gamma}$ is a nondegenerate canonical equilibrium state, then the corresponding
steady flow is stable; specifically, if $q(t)$ denotes the solution to (1) and if $\|q(0) - \bar q\|$
is sufficiently small, then for all time $t > 0$

$$\|q(t) - \bar q\| \le c\, \|q(0) - \bar q\|$$

for some finite constant $c$.

Proof. The proof relies on the fact that $I_{\beta,\gamma}$ is a conserved quantity for the dynamics
(1). The conservation of $H$ and $C$ is immediate, since they are rugged invariants. The
information functional $I$ is also an invariant under the ideal dynamics that governs $q(t)$,
since it coincides with a generalized enstrophy integral under the identification
$a = i$. We claim that the invariant $I_{\beta,\gamma}$ satisfies

$$\frac{\mu}{4}\|q - \bar q\|^2 \le I_{\beta,\gamma}(q) \le 2\nu\|q - \bar q\|^2$$

for all $q$ in a small $L^2$-neighborhood of $\bar q$. These estimates follow from the upper and lower
bounds on the second variation $\delta^2 I_{\beta,\gamma}(\bar q)$ given in (37) and (38), in view of the fact that
$I_{\beta,\gamma}(\bar q) = 0$ and $\delta I_{\beta,\gamma}(\bar q) = 0$. The derivation makes use of a standard estimation of the
remainder terms in the second-order Taylor expansion of the smooth functional $I_{\beta,\gamma}$ about
$\bar q$. Then, the usual Lyapunov argument yields

$$\frac{\mu}{4}\|q(t) - \bar q\|^2 \le I_{\beta,\gamma}(q(t)) = I_{\beta,\gamma}(q(0)) \le 2\nu\|q(0) - \bar q\|^2$$

for all $t > 0$, thereby proving the theorem.
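
The last step of the argument can be written out explicitly. As a sketch, suppose only that the conserved functional obeys a two-sided quadratic bound with generic positive constants $a$ and $b$ (symbols introduced here for illustration; they stand for whatever multiples of $\mu$ and $\nu$ appear in the estimate above). Then the constant $c$ of Theorem 6 may be taken to be $\sqrt{b/a}$:

```latex
a\,\|q(t)-\bar q\|^2 \;\le\; I_{\beta,\gamma}(q(t)) \;=\; I_{\beta,\gamma}(q(0)) \;\le\; b\,\|q(0)-\bar q\|^2
\quad\Longrightarrow\quad
\|q(t)-\bar q\| \;\le\; \sqrt{b/a}\;\|q(0)-\bar q\| .
```

The smallness assumption on $\|q(0)-\bar q\|$ is what keeps $q(t)$ inside the neighborhood where the two-sided bound is valid, so that the chain of inequalities can be applied for all $t > 0$.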

We remark that this proof of Lyapunov stability requires only that $I_{\beta,\gamma}(q(t)) \le I_{\beta,\gamma}(q(0))$
for $t > 0$. This observation allows us to make a connection with the Turkington model [49],
which is based on an argument that only inequalities on convex generalized enstrophies
constrain the ideal dynamics. Even though $I$ is treated as a fragile invariant in that
framework, the nonlinear stability of $\bar q$ remains valid, since $H$ and $C$ are rugged invariants.

In the context of the statistical equilibrium theory, the classical stability criteria amount
to sufficient conditions for the nondegeneracy of the minimizer $\bar q$. For positive temperature
states ($\beta > 0$), the so-called first Arnold theorem applies, while for negative temperature
states ($\beta < 0$), the so-called second Arnold theorem applies. In either case the
sufficient condition for stability is that the bounded, symmetric operator $i''(\bar q) + \beta G$ be
positive definite. This form of the stability condition can be translated into the familiar
form used in deterministic studies of steady flows by means of the formula

$$\frac{d\bar q}{d\bar\psi} = -\frac{\beta}{i''(\bar q)},$$

which follows from the mean-field equation (25) and the fact that $f'$ and $i'$ are inverse
functions. In this form, the first Arnold theorem applies when $d\bar q/d\bar\psi < 0$, while the
second Arnold theorem applies when $0 < d\bar q/d\bar\psi < \lambda_1$, where $\lambda_1$ is the smallest eigenvalue
of $-\Delta + r^{-2}$. If a deterministic steady flow corresponding to a potential vorticity field $q$ is
submitted to these stability criteria, there often are instances when neither the first nor the
second theorem applies; these steady flows correspond to critical points for $I_{\beta,\gamma}$ at which its
second variation is negative in some direction. By contrast, any nondegenerate canonical
equilibrium state $\bar q$ satisfies these criteria, the first when $\beta > 0$ and the second when $\beta < 0$.
Thus, apart from degeneracies such as occur at phase transitions, the statistical equilibrium
theory always produces mean flows that are both steady and stable.
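
The translation between the operator condition and the familiar pointwise condition can be checked in two short steps. The sketch below assumes, as stated in the text, that the mean-field equation has the form $\bar q = f'(-\beta\bar\psi - \gamma)$ and that $f'$ and $i'$ are inverse functions; the symbol $\eta = -\beta\bar\psi - \gamma$ is shorthand introduced here, and the operator norm $\|G\| = 1/\lambda_1$ is used for the negative temperature case.

```latex
\bar q = f'(\eta),\quad \eta = -\beta\bar\psi-\gamma
\;\Longrightarrow\;
\frac{d\bar q}{d\bar\psi} = -\beta f''(\eta) = -\frac{\beta}{i''(\bar q)},
\qquad\text{since } i'(f'(\eta)) = \eta \text{ gives } i''(\bar q)\,f''(\eta) = 1 .

% Negative temperature case, \beta < 0:
\int_X \big[\, i''(\bar q)\,z^2 + \beta\, z\,Gz \,\big]\,dx
\;\ge\; \Big( \min i''(\bar q) - \frac{|\beta|}{\lambda_1} \Big)\,\|z\|^2 .
```

The right-hand side is positive whenever $|\beta|/i''(\bar q) < \lambda_1$ throughout the domain, that is, whenever $0 < d\bar q/d\bar\psi < \lambda_1$. For $\beta > 0$ both terms of the integrand are nonnegative and no restriction beyond $d\bar q/d\bar\psi < 0$ is needed.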

5.2 Refined stability theorems. When a microcanonical equilibrium $\bar q \in \mathcal{E}^{E,\Gamma}$ does not
lie in any canonical equilibrium set $\mathcal{E}_{\beta,\gamma}$, the stability results of the preceding subsection
do not apply. Nevertheless, every nondegenerate equilibrium state for the microcanonical
model determines a stable flow, as we now show by giving a more refined nonlinear stability
analysis.

Again, we assume that $\bar q \in \mathcal{E}^{E,\Gamma}$ is the isolated, nondegenerate minimizer of the microcanonical
information $I$ at given microcanonical constraint values $E$ and $\Gamma$. In the microcanonical
model, however, the second-order conditions at a constrained minimizer $\bar q$ are subject
to side-conditions on $\delta q$, which are the linearization of the constraints $H = E$, $C = \Gamma$.
Precisely, we say that $\bar q$ is a nondegenerate microcanonical equilibrium state if (37) holds
for all $\delta q \in L^2(X)$ that satisfy the linearized constraints $\int_X \bar\psi\, \delta q\, dx = 0$ and $\int_X \delta q\, dx = 0$,
with a positive constant $\mu$ independent of these $\delta q$. The complementary upper bound (38) also
holds at the microcanonical equilibrium $\bar q$, with a constant $\nu$ determined as in the canonical model;
in fact, the upper bound also holds for $\delta q$ not satisfying the side-conditions.

Our strategy for proving the stability of $\bar q$ is to construct a Lyapunov functional in the
form

$$L^{\sigma,\tau}_{E,\Gamma}(q) = I(q) + S(E,\Gamma) + \beta\,[H(q) - E] + \gamma\,[C(q) - \Gamma] + \frac{\sigma}{2}[H(q) - E]^2 + \frac{\tau}{2}[C(q) - \Gamma]^2 , \tag{39}$$

where $\beta$ and $\gamma$ are the Lagrange multipliers for the energy and circulation constraints,
respectively, and $\sigma$ and $\tau$ are sufficiently large positive constants. The terms in (39) scaled
by $\sigma$ and $\tau$ penalize departures from the microcanonical constraints and thereby capture
the microcanonical conditioning in the Lyapunov functional. Moreover, these terms do not
change the value of the Lyapunov functional or its first variation at $\bar q$, which are

$$L^{\sigma,\tau}_{E,\Gamma}(\bar q) = 0, \qquad \delta L^{\sigma,\tau}_{E,\Gamma}(\bar q) = \delta (I + \beta H + \gamma C)(\bar q) = 0 .$$

For this reason, it is possible to choose finite constants $\sigma$ and $\tau$ so that $L^{\sigma,\tau}_{E,\Gamma}$ has a nondegenerate,
unconstrained minimum at $\bar q$. In this sense $L^{\sigma,\tau}_{E,\Gamma}$ is identical to the "augmented
Lagrangian" often used in numerical methods of constrained optimization 0, 38 |. 



In the case of full equivalence, when the concavity condition (34) holds as a strict
inequality for all $(E',\Gamma') \ne (E,\Gamma)$, the penalizing terms are unnecessary, because $L^{0,0}_{E,\Gamma}$
coincides with $I_{\beta,\gamma}$ and hence furnishes a Lyapunov functional at $\bar q$. Indeed, the argument
used to prove part (a) of Theorem 4 applies to this situation, and guarantees that $L^{0,0}_{E,\Gamma}(q) >
L^{0,0}_{E,\Gamma}(\bar q) = 0$ for all $q \ne \bar q$. In the cases of nonequivalence or partial equivalence, however,
when the microcanonical equilibrium $\bar q$ may not be contained in the corresponding canonical
equilibrium set, $L^{0,0}_{E,\Gamma}$ may not be a Lyapunov functional at $\bar q$. In those cases, $\delta^2 L^{0,0}_{E,\Gamma}(\bar q)$ may be
negative for variations $\delta q$ that are not tangential to the constraint manifold $H = E$, $C = \Gamma$.
In general, it is therefore necessary to include penalization parameters $\sigma$ and $\tau$ so that
$\delta^2 L^{\sigma,\tau}_{E,\Gamma}(\bar q)$ is positive definite. An explicit calculation of this second variation, namely,

$$\delta^2 L^{\sigma,\tau}_{E,\Gamma}(\bar q) = \delta^2 (I + \beta H + \gamma C)(\bar q) + \sigma \left( \int_X \bar\psi\, \delta q\, dx \right)^2 + \tau \left( \int_X \delta q\, dx \right)^2 , \tag{40}$$

suggests that it is indeed positive definite on arbitrary variations $\delta q$ when $\sigma$ and $\tau$ are
sufficiently large.

The nonlinear stability result for the microcanonical model is the content of the following 
theorem. 



Theorem 7. If $\bar q \in \mathcal{E}^{E,\Gamma}$ is a nondegenerate microcanonical equilibrium state, then the
corresponding steady flow is stable; specifically, if $q(t)$ denotes the solution to (1) and if
$\|q(0) - \bar q\|$ is sufficiently small, then for all time $t > 0$

$$\|q(t) - \bar q\| \le c\, \|q(0) - \bar q\|$$

for some finite constant $c$.



Proof. The crux of the proof is to demonstrate that the second variation $\delta^2 L^{\sigma,\tau}_{E,\Gamma}(\bar q)$ is
strictly positive definite when $\sigma$ and $\tau$ are fixed large enough. This analysis makes use of
the bilinear form associated with the operator $i''(\bar q) + \beta G$, which we denote by

$$D^2(z_1, z_2) = \int_X \left[ i''(\bar q)\, z_1 z_2 + \beta\, z_1 G z_2 \right] dx \qquad (z_1, z_2 \in L^2(X)). \tag{41}$$

From the identity in (26), it is clear that $D^2(\delta q, \delta q) = \delta^2 (I + \beta H + \gamma C)(\bar q)$. Also, we let
$(z_1, z_2) = \int_X z_1 z_2\, dx$ denote the inner product on $L^2(X)$.

We decompose any variation $\delta q \in L^2(X)$ into a part $\delta q^{\parallel}$ tangent to the microcanonical
manifold at $\bar q$, and a part $\delta q^{\perp}$ orthogonal to it; that is,

$$\delta q = \delta q^{\parallel} + \delta q^{\perp},$$

where $\delta q^{\perp} = \xi \bar\psi + \eta 1$ for some $\xi, \eta \in \mathbb{R}$, and $(\delta q^{\parallel}, \bar\psi) = 0$, $(\delta q^{\parallel}, 1) = 0$. It is easy to verify
that the functions $\bar\psi$ and $1$ are linearly independent, given that $E \ne 0$ in the microcanonical
energy constraint. Thus, the components $\xi$ and $\eta$ are uniquely determined by $\delta q$, in that
they solve the associated normal equations. A straightforward analysis then shows that
the inequality

$$\frac{(\bar\psi, \delta q)^2}{\|\bar\psi\|^2} + \frac{(1, \delta q)^2}{\|1\|^2} = \frac{(\bar\psi, \delta q^{\perp})^2}{\|\bar\psi\|^2} + \frac{(1, \delta q^{\perp})^2}{\|1\|^2} \ge \theta\, \|\delta q^{\perp}\|^2$$

holds for a positive constant $\theta$ depending on the angle between $\bar\psi$ and $1$ in $L^2(X)$.
We now substitute this decomposition into (40) and analyze the resulting terms:



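$$\delta^2 L^{\sigma,\tau}_{E,\Gamma}(\bar q) = D^2(\delta q^{\parallel}, \delta q^{\parallel}) + 2 D^2(\delta q^{\parallel}, \delta q^{\perp}) + D^2(\delta q^{\perp}, \delta q^{\perp}) + \sigma (\bar\psi, \delta q^{\perp})^2 + \tau (1, \delta q^{\perp})^2 . \tag{42}$$

The nondegeneracy hypothesis ensures that

$$D^2(\delta q^{\parallel}, \delta q^{\parallel}) \ge \mu \|\delta q^{\parallel}\|^2 . \tag{43}$$

On the other hand, the upper bound (38) gives

$$|D^2(\delta q^{\perp}, \delta q^{\perp})| \le \nu \|\delta q^{\perp}\|^2 . \tag{44}$$

In a similar fashion the cross term is estimated by means of the Cauchy inequality, giving

$$2\, |D^2(\delta q^{\parallel}, \delta q^{\perp})| \le \nu\epsilon\, \|\delta q^{\parallel}\|^2 + \frac{\nu}{\epsilon} \|\delta q^{\perp}\|^2 \tag{45}$$

for any $\epsilon > 0$. When we use (43), (44), and (45) to estimate the various terms in (42), we
obtain the following lower bound:

$$\delta^2 L^{\sigma,\tau}_{E,\Gamma}(\bar q) \ge \mu \|\delta q^{\parallel}\|^2 - \nu\epsilon\, \|\delta q^{\parallel}\|^2 - \frac{\nu}{\epsilon} \|\delta q^{\perp}\|^2 - \nu \|\delta q^{\perp}\|^2 + \sigma (\bar\psi, \delta q^{\perp})^2 + \tau (1, \delta q^{\perp})^2 .$$

We therefore choose $\epsilon = \mu/2\nu$ to make the terms in $\delta q^{\parallel}$ definite. Then, in order to make
the terms in $\delta q^{\perp}$ definite, we seek $\sigma$ and $\tau$ so that

$$\sigma (\bar\psi, \delta q^{\perp})^2 + \tau (1, \delta q^{\perp})^2 \ge \left( \frac{\mu}{2} + \frac{\nu}{\epsilon} + \nu \right) \|\delta q^{\perp}\|^2 .$$

It suffices to set these penalization parameters so that $\sigma\theta\|\bar\psi\|^2$ and $\tau\theta\|1\|^2$ equal the
common value $\mu/2 + \nu/\epsilon + \nu$. With this choice, we obtain the desired lower bound:

$$\delta^2 L^{\sigma,\tau}_{E,\Gamma}(\bar q) \ge \frac{\mu}{2} \|\delta q^{\parallel}\|^2 + \frac{\mu}{2} \|\delta q^{\perp}\|^2 = \frac{\mu}{2} \|\delta q\|^2 . \tag{46}$$

Thus, $\delta^2 L^{\sigma,\tau}_{E,\Gamma}(\bar q)$ is strictly positive definite, and hence we conclude that for all $q$ in a
sufficiently small neighborhood of $\bar q$,

$$\tilde\mu\, \|q - \bar q\|^2 \le L^{\sigma,\tau}_{E,\Gamma}(q) \le \tilde\nu\, \|q - \bar q\|^2$$

for some $0 < \tilde\mu < \tilde\nu < \infty$. The usual Lyapunov stability argument therefore ensues, since
$L^{\sigma,\tau}_{E,\Gamma}$ is a conserved quantity for the dynamics. Thus, the proof of the theorem is complete.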
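
The mechanism of the proof can be illustrated in a two-dimensional toy setting. The sketch below is not taken from the paper: it replaces $L^2(X)$ by the plane, with one tangential and one normal coordinate, and the matrix entries `a`, `c`, `d` are arbitrary illustrative numbers. It assumes the single normal direction is a unit vector, so that the penalty simply adds the level $K = \mu/2 + \nu/\epsilon + \nu$ (with $\epsilon = \mu/2\nu$) to the normal diagonal entry.

```python
import math

def sym2_eigs(a, c, d):
    """Eigenvalues (ascending) of the symmetric matrix [[a, c], [c, d]]."""
    mean = 0.5 * (a + d)
    rad = math.hypot(0.5 * (a - d), c)
    return mean - rad, mean + rad

def penalty_level(mu, nu):
    """K = mu/2 + nu/eps + nu with eps = mu/(2*nu), the choice made in the proof."""
    eps = mu / (2.0 * nu)
    return 0.5 * mu + nu / eps + nu

# Toy bilinear form in (tangential, normal) coordinates; the numbers are illustrative.
a, c, d = 1.0, 1.5, -2.0      # a > 0 on the tangent direction, d < 0 on the normal one
mu = a                         # lower bound on the tangential part, as in (43)
lo, hi = sym2_eigs(a, c, d)
nu = max(abs(lo), abs(hi))     # operator norm, playing the role of nu in (44)

K = penalty_level(mu, nu)
pen_lo, pen_hi = sym2_eigs(a, c, d + K)   # penalized form

print("unpenalized smallest eigenvalue:", lo)
print("penalized smallest eigenvalue:  ", pen_lo)
```

In this toy case the unpenalized form is indefinite, while the penalized form has smallest eigenvalue at least $\mu/2$, mirroring (46). The level $K$ produced by the proof is sufficient but generally far from minimal.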
As in the canonical model, we note that this Lyapunov stability argument remains valid
when the objective functional $I$ is treated as a fragile invariant, since the constraint functionals
$H$ and $C$ are rugged invariants.

We conclude this discussion of stability with some remarks about the role of the penalization
in the Lyapunov functional $L^{\sigma,\tau}_{E,\Gamma}$ and its relation to the microcanonical entropy
$S(E,\Gamma)$. For the sake of definiteness, let us suppose that $S(E,\Gamma)$ is smooth ($C^2$) on its
domain $\mathcal{A}$, and let us consider a constraint pair $(E,\Gamma)$ that does not belong to $\mathcal{C}$, the
concavity set. Then, according to the results in Section 4, the microcanonical equilibrium
macrostate $\bar q$ for $(E,\Gamma)$ does not belong to any canonical equilibrium set, and the tangent
plane to $S$ at $(E,\Gamma)$ is not a supporting plane, meaning that (34) is violated for some
$(E',\Gamma')$. Nevertheless, it is possible to choose constants $\sigma$ and $\tau$ so that they define a
supporting paraboloid to $S$ at $(E,\Gamma)$, in the sense that

$$S(E',\Gamma') \le S(E,\Gamma) + \beta (E' - E) + \gamma (\Gamma' - \Gamma) + \frac{\sigma}{2}(E' - E)^2 + \frac{\tau}{2}(\Gamma' - \Gamma)^2$$

for all $(E',\Gamma')$ in $\mathcal{A}$, with equality only when $(E',\Gamma') = (E,\Gamma)$. It follows that $L^{\sigma,\tau}_{E,\Gamma}(q) >
L^{\sigma,\tau}_{E,\Gamma}(\bar q) = 0$ for all $q \ne \bar q$, by an argument analogous to that used in the proof of Theorem 4.
Thus, we see that the minimal choice of the penalization constants $\sigma$ and $\tau$ is determined
by the condition that, at least locally, the corresponding paraboloid lies above the function
$S$ and contacts it only at $(E,\Gamma)$.
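
The supporting-paraboloid condition can be explored on a one-variable toy entropy. The following sketch is purely illustrative: the double-hump function `S`, the grid, and the helper `minimal_penalty` are assumptions made here and have no connection to the computed entropy of Section 6. It finds, on a grid, the smallest constant $\tau$ for which the parabola through a nonconcave point lies above the curve.

```python
def S(x):
    """Toy nonconcave 'entropy' with maxima at x = -1 and x = +1 (illustrative only)."""
    return -(x * x - 1.0) ** 2

def dS(x):
    """Derivative of the toy entropy."""
    return -4.0 * x * (x * x - 1.0)

def minimal_penalty(x0, grid):
    """Smallest tau with S(x) <= S(x0) + S'(x0)(x - x0) + (tau/2)(x - x0)^2 on the grid."""
    best = 0.0
    for x in grid:
        if abs(x - x0) < 1e-12:
            continue                      # skip the contact point itself
        gap = S(x) - S(x0) - dS(x0) * (x - x0)
        best = max(best, 2.0 * gap / (x - x0) ** 2)
    return best

grid = [(k - 2000) / 1000.0 for k in range(4001)]   # points in [-2, 2], spacing 0.001

# At x0 = 0 the tangent line is horizontal and lies below the maxima at x = +1 and -1,
# so it is not a supporting line; a parabola with large enough curvature is.
tangent_supports = all(S(x) <= S(0.0) + 1e-12 for x in grid)
tau_min = minimal_penalty(0.0, grid)
print(tangent_supports, tau_min)
```

For this particular toy function the minimal curvature is attained in the limit $x \to x_0$, so it coincides with the local second derivative of $S$ at the contact point; for other shapes the binding constraint can sit far from the contact point, which is why a global check over the grid is used.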



6 Numerical example 

6.1 Barotropic flow over topography in a zonal channel. For the purposes of 
illustrating the general results obtained in Sections 4 and 5, we now present a family 
of computed solutions to the microcanonical variational principle for a particular choice of 
domain, topography and prior distribution. We especially focus on the shape of the $S(E,\Gamma)$
surface, since it determines whether the corresponding canonical model is equivalent and
whether the Arnold stability criteria apply to the equilibrium states. In view of our results
in Section 4 showing that all canonical equilibria are included among the microcanonical
equilibria, there is no need to implement a solver for the corresponding canonical variational
principle.

We take the domain to be a unit square $X = \{ -0.5 < x_1 < 0.5,\ -0.5 < x_2 < 0.5 \}$,
which represents a normalized zonal channel. For the topography term $b$ in the potential
vorticity expression, we choose a simple sinusoid, $b = b(x_2) = B_2 \sin(2\pi x_2)$. This
topography is zonal, being independent of $x_1$, and consists of the second harmonic with
respect to $x_2$.

Such a zonal domain and topography can be viewed as an idealized and simplified model
of a zone-belt domain in a Jovian atmosphere [26, 16, 34]. In the 1-1/2-layer model, $b$ is the
effective topography that results from an underlying steady mean flow in a deep lower layer.
The domain is composed of a zone, where $b$ is positive, and a belt, where $b$ is negative.
If the amplitude $B_2$ of the topography is large enough, one expects that the mean flow
in the shallow upper layer will be a shear flow $v = (v_1(x_2), 0)$, and that it will tend to be
anticyclonic (negative vorticity) in the zone and cyclonic (positive vorticity) in the belt. In
our computations of most probable flows we set $B_2 = 1$, and we find that they are zonal
shear flows with the expected topography-induced tendencies.

We illustrate the effect of a large or a small radius of deformation by choosing the
representative values $r = \infty$ or $r = 0.2$. The small deformation radius regime is the one
relevant to a Jovian atmosphere [34].



With these choices of the geometrical parameters, the formulation of the model problem
is complete once we specify a prior distribution $\rho$, which determines the probabilistic structure
of the small-scale potential vorticity $Q$. We select a family of gamma distributions
$\rho_\epsilon(dy)$ with mean, variance and skewness normalized as follows:

$$\int y\, \rho_\epsilon(dy) = 0, \qquad \int y^2 \rho_\epsilon(dy) = 1, \qquad \int y^3 \rho_\epsilon(dy) = 2\epsilon ;$$

the variable $y$ runs through the range of $Q$. For small $\epsilon$, these distributions are close
to the standard normal distribution, which they approach in the limit as $\epsilon$ goes to zero.
For positive $\epsilon$, they are supported on the interval $-\epsilon^{-1} < y < +\infty$, and they have an
exponential tail in the positive $y$-direction. They are defined explicitly by the probability
density

$$\rho_\epsilon(dy) = \frac{\epsilon^{\,1 - 2\epsilon^{-2}}}{\Gamma(\epsilon^{-2})}\, (1 + \epsilon y)^{-1} \exp\!\left( \epsilon^{-2} \left[ \log(1 + \epsilon y) - (1 + \epsilon y) \right] \right) dy .$$
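
The stated normalization of the moments can be checked numerically. The sketch below assumes that the prior is the shifted gamma law for which $y + \epsilon^{-1}$ has shape $\epsilon^{-2}$ and scale $\epsilon$, which is the law whose cumulant generating function is $-\epsilon^{-1}\eta - \epsilon^{-2}\log(1-\epsilon\eta)$; the function names, the step `h`, and the cutoff `y_max` are choices made here for illustration (the cutoff is adequate for $\epsilon = 0.1$ and would need adjusting for other values).

```python
import math

def log_density(y, eps):
    """Log-density of y when y + 1/eps is Gamma(shape=eps**-2, scale=eps) (assumed form)."""
    k = eps ** -2                      # shape parameter
    u = 1.0 + eps * y                  # positive on the support y > -1/eps
    return ((1.0 - 2.0 * k) * math.log(eps) - math.lgamma(k)
            + (k - 1.0) * math.log(u) - k * u)

def moments(eps, h=1e-3, y_max=25.0):
    """Riemann-sum estimates of the integrals of y**p times the density, p = 0, 1, 2, 3."""
    y_min = -1.0 / eps
    n = int((y_max - y_min) / h)
    m = [0.0, 0.0, 0.0, 0.0]
    for j in range(1, n):              # interior nodes; the density is negligible at both ends
        y = y_min + j * h
        w = h * math.exp(log_density(y, eps))
        for p in range(4):
            m[p] += w * y ** p
    return m

m0, m1, m2, m3 = moments(0.1)
print(m0, m1, m2, m3)   # compare with total mass 1, mean 0, variance 1, third moment 2*eps
```

Working with the logarithm of the density avoids overflow in the factors $\Gamma(\epsilon^{-2})$ and $\epsilon^{1-2\epsilon^{-2}}$, which are individually enormous for small $\epsilon$.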

The family of prior distributions $\rho_\epsilon(dy)$ has the virtue that their cumulant generating
functions $f_\epsilon(\eta)$, defined as in the general theory, can be calculated explicitly; namely,

$$f_\epsilon(\eta) = -\epsilon^{-1}\eta - \epsilon^{-2} \log(1 - \epsilon\eta) = \frac{\eta^2}{2} + \frac{\epsilon\eta^3}{3} + O(\epsilon^2) . \tag{47}$$

The associated information functional $I_\epsilon$ is then determined by the conjugate
function $i_\epsilon(y)$ to $f_\epsilon(\eta)$; namely,

$$i_\epsilon(y) = \epsilon^{-1} y - \epsilon^{-2} \log(1 + \epsilon y) .$$

The relevant properties of the convex function $i_\epsilon(y)$ are easily seen from its second derivative,
$i_\epsilon''(y) = (1 + \epsilon y)^{-2}$.
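
The conjugacy between $f_\epsilon$ and $i_\epsilon$ can be verified pointwise. The sketch below is a check written for this purpose, with function names chosen here; it evaluates the closed forms and their derivatives and confirms that $f_\epsilon'$ and $i_\epsilon'$ are inverse functions, that $i_\epsilon(y) = \eta y - f_\epsilon(\eta)$ at $\eta = i_\epsilon'(y)$, and that the two second derivatives are reciprocal.

```python
import math

def f(eta, eps):
    """Cumulant generating function: -eta/eps - log(1 - eps*eta)/eps**2."""
    return -eta / eps - math.log1p(-eps * eta) / eps ** 2

def df(eta, eps):
    """f'(eta) = ((1 - eps*eta)**-1 - 1)/eps."""
    return (1.0 / (1.0 - eps * eta) - 1.0) / eps

def d2f(eta, eps):
    """f''(eta) = (1 - eps*eta)**-2."""
    return 1.0 / (1.0 - eps * eta) ** 2

def i(y, eps):
    """Conjugate function: y/eps - log(1 + eps*y)/eps**2."""
    return y / eps - math.log1p(eps * y) / eps ** 2

def di(y, eps):
    """i'(y) = y/(1 + eps*y)."""
    return y / (1.0 + eps * y)

def d2i(y, eps):
    """i''(y) = (1 + eps*y)**-2."""
    return 1.0 / (1.0 + eps * y) ** 2

eps = 0.1
checks = []
for y in (-2.0, 0.0, 1.5):
    eta = di(y, eps)                                   # maximizer in the Legendre transform
    checks.append((abs(df(eta, eps) - y),              # f' inverts i'
                   abs(i(y, eps) - (eta * y - f(eta, eps))),
                   abs(d2i(y, eps) * d2f(eta, eps) - 1.0)))
print(checks)
```

The reciprocity $i''_\epsilon(\bar q)\, f''_\epsilon(\eta) = 1$ is the relation used in Section 5.1 to bound $i''(\bar q)$ through the variance $f''(\eta)$.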

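The mean-field equation (25) corresponding to this choice of prior distribution is

$$-\Delta\bar\psi + r^{-2}\bar\psi + B_2 \sin 2\pi x_2 = \epsilon^{-1} \left( [1 - \epsilon(-\beta\bar\psi - \gamma)]^{-1} - 1 \right) = (-\beta\bar\psi - \gamma) + \epsilon (-\beta\bar\psi - \gamma)^2 + O(\epsilon^2) . \tag{48}$$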
From the above expansion it is evident that $\epsilon$ determines the magnitude of the principal
nonlinear term in this equation. When $\epsilon = 0$, the models resemble the so-called energy-
enstrophy theory, in which the statistical equilibrium distributions are Gaussian and the
mean-field equations are linear 0, [K], || |32[. For the sake of definiteness, we fix e = 0.1 
in the computations to follow. It is worth noting that $\epsilon$ links the skewness of the prior
distribution to the nonlinearity of the mean-field equation.

While many reasonable choices of prior distribution suffice for the purposes of the
present example, the relation between the potential vorticity and streamfunction in (48) is
distinguished by the fact that it agrees with the form of the relation inferred by an analysis
of observed zonal winds on Jupiter [16]. We note however that this physically interesting
prior distribution violates the growth condition (11) assumed for simplicity in our discussion
of the general theory. Nevertheless, all of the key results described in the preceding sections
remain valid for this prior distribution, although their proofs are somewhat more involved.
In particular, the basic large deviation principle given in Theorem 1 continues to hold; by
virtue of the Gärtner-Ellis Theorem, it is sufficient that $f_\epsilon(\eta)$ is finite and smooth on
the interval $-\infty < \eta < \epsilon^{-1}$. We omit the analysis that justifies this extension of the theory
already developed.

6.2 Computed results. To solve the variational principle for the microcanonical model
derived in Theorem 3, we implement the iterative algorithm developed in [50] and extended
in [15]. Specifically, for given admissible values $E$ and $\Gamma$ of the energy and circulation
constraints, respectively, we compute the equilibrium macrostate $\bar q$ that solves

minimize $I_\epsilon(q)$ subject to $H(q) = E$, $C(q) = \Gamma$.

From an initial guess $q^0$ having the given constraint values $(E,\Gamma)$, this algorithm defines a
sequence $q^k$ of approximations that converges to a solution $\bar q$ as $k \to \infty$. At each iteration,
a variational subproblem defined by linearizing the energy constraint is solved; its solution,
$q^k$, then satisfies $H(q^k) \ge E$, $C(q^k) = \Gamma$, and $I(q^k) \le I(q^{k-1})$. These properties of the
iteration step guarantee that the algorithm is globally convergent [50]. In the limit as
$k \to \infty$, the equality constraint on energy is retrieved, and the iterative multipliers, $\beta^k$
and $\gamma^k$, which are determined along with $q^k$, converge to the multipliers $\beta$ and $\gamma$ associated
with $\bar q$. Experience with this algorithm in a wide range of statistical equilibrium problems
has shown it to be an efficient and robust method.
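
The three properties of the iteration step can be seen in a finite-dimensional toy problem. The sketch below is not the algorithm of the cited references: it assumes a Gaussian prior, so that $I(q) = \frac12 |q|^2$, a diagonal energy $H(q) = \frac12 \sum_j q_j^2/\lambda_j$ with hypothetical eigenvalues `lam`, no topography, and no circulation constraint. Under these assumptions the linearized subproblem has a closed-form solution, which is all that `linearized_step` implements.

```python
import math

def energy(q, lam):
    """Toy energy H(q) = (1/2) * sum(q_j**2 / lam_j); convex in q."""
    return 0.5 * sum(x * x / l for x, l in zip(q, lam))

def information(q):
    """Toy information functional I(q) = (1/2) * |q|**2 (Gaussian prior assumed)."""
    return 0.5 * sum(x * x for x in q)

def linearized_step(p, lam, E):
    """Minimize I(q) subject to H(p) + <H'(p), q - p> >= E.

    With g = H'(p) and <g, p> = 2 H(p), the constraint reads <g, q> >= E + H(p),
    and the minimum-norm point satisfying it is q = (E + H(p)) * g / |g|**2.
    """
    g = [x / l for x, l in zip(p, lam)]
    b = E + energy(p, lam)
    t = b / sum(x * x for x in g)
    return [t * x for x in g]

def iterate(q0, lam, E, steps=60):
    """Return the list of iterates q^0, q^1, ..., q^steps."""
    history = [q0]
    for _ in range(steps):
        history.append(linearized_step(history[-1], lam, E))
    return history

lam = [1.0, 4.0, 9.0]                                # hypothetical eigenvalues
E = 0.05
s = math.sqrt(2.0 * E / sum(1.0 / l for l in lam))   # scale q^0 so that H(q^0) = E
history = iterate([s, s, s], lam, E)
print(information(history[-1]), energy(history[-1], lam))
```

Because the toy energy is convex, its linearization underestimates it, so every iterate satisfies $H(q^k) \ge E$; and because $q^{k-1}$ is itself feasible for the linearized constraint, $I(q^k) \le I(q^{k-1})$. In this toy case the iterates converge to the mode with the smallest $\lambda_j$, where the energy constraint holds with equality.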

We now turn to a description of the computed results for this specific microcanonical 
equilibrium problem. 

We compute the equilibrium states $\bar q = \bar q(x_2; E, \Gamma)$ over the range of constraint values,
$0 < E < 0.1$, $-2 < \Gamma < 2$, for both (a) $r = \infty$ and (b) $r = 0.2$. In each case, we tabulate
the microcanonical entropy $S(E,\Gamma) = -I(\bar q)$. In Figure 1, we exhibit the admissible set $\mathcal{A}$
and the concavity set $\mathcal{C}$ for these two values of $r$. We recall from Section 4 that $\mathcal{A}$ is the set
of all pairs $(E,\Gamma)$ for which there exists some macrostate $q$ realizing those constraint values
$(E,\Gamma)$, and that $\mathcal{C}$ is the subset of all pairs $(E,\Gamma)$ at which $S$ has a supporting plane. In
Figure 1, $\mathcal{C}$ is indicated by "equivalence" and $\mathcal{A} \setminus \mathcal{C}$ by "nonequivalence." The remarkable
result contained in Figure 1 is that, for both $r = \infty$ and $r = 0.2$, the concavity set $\mathcal{C}$ is
a relatively small subset of the admissible set $\mathcal{A}$. In fact, for any fixed circulation $\Gamma$, the
pair $(E,\Gamma)$ lies in $\mathcal{C}$ only for a limited range of energies near the smallest admissible energy.
For all the energies outside this range, the tangent plane to $S$ at $(E,\Gamma)$ is not a supporting
plane for $S$. Consequently, for this range of larger energies, the equivalence of ensembles
breaks down, meaning that the canonical model omits all these microcanonical equilibrium
states.
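
The membership test for the concavity set amounts to checking whether a tangent plane lies above the tabulated entropy everywhere on the admissible grid. A one-variable sketch of that test follows; the toy function `S`, its derivative, and the grid are assumptions made here for illustration and are unrelated to the computed entropy surface.

```python
def S(x):
    """Toy nonconcave 'entropy' with maxima at x = -1 and x = +1 (illustrative only)."""
    return -(x * x - 1.0) ** 2

def dS(x):
    """Derivative of the toy entropy; plays the role of the multiplier (slope)."""
    return -4.0 * x * (x * x - 1.0)

def in_concavity_set(x0, grid, tol=1e-12):
    """True if the tangent line at x0 lies on or above S at every grid point."""
    s0, slope = S(x0), dS(x0)
    return all(S(x) <= s0 + slope * (x - x0) + tol for x in grid)

grid = [(k - 2000) / 1000.0 for k in range(4001)]   # points in [-2, 2], spacing 0.001

accepted = [x for x in (-1.5, 0.0, 0.5, 1.5) if in_concavity_set(x, grid)]
print(accepted)
```

Points between the two maxima are rejected because the tangent line there cuts below one of the humps, which is the one-dimensional analogue of a constraint pair lying in the nonequivalence set.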

Another graphical depiction of the nonconcavity present in $S(E,\Gamma)$ is given in Figures
2 and 3. In Figure 2 the section of $S$ versus $E$ at the fixed value $\Gamma = 0$ is plotted. This
entropy-energy curve shows that the inverse temperature $\beta = \partial S/\partial E$ is positive only for a
small range of low energies below $E = 0.01$. Throughout the negative temperature range,
the entropy function is slightly concave with respect to $E$, becoming asymptotically linear
for high energy values. By contrast, the section of $S$ versus $\Gamma$ at $E = 0.05$ plotted in
Figure 3 shows that the entropy-circulation curve is strongly nonconcave for a wide range
of circulation values around zero. This result suggests that in this particular problem the
nonequivalence of ensembles is largely a consequence of the circulation constraint.

Figures 1, 2 and 3 also indicate the dependence of the solutions on the radius of deformation
$r$. The nonequivalence set $\mathcal{A} \setminus \mathcal{C}$ broadens noticeably as $r$ is decreased from $r = \infty$
to $r = 0.2$. Also, the asymmetry in the entropy-circulation curve, which is a consequence
of the skewness $2\epsilon$ of the prior distribution, increases with decreasing $r$. These two results
suggest that the effect of the nonlinearity, as measured by $\epsilon$, is strengthened by a small
deformation radius. From this behavior we conclude that the breakdown of the equivalence
of ensembles is exacerbated by a weak vertical stratification, which results in a small $r$.
This conclusion is especially interesting in the application to the Jovian atmosphere, where
large-scale mean flows such as the permanent zonal winds typically span several radii of
deformation [34].



Finally, in Figure 4 we display the mean velocity fields associated with some representative
microcanonical equilibrium states. Specifically, we fix $r = 0.2$ and $E = 0.05$, and
we choose three representative values of the circulation: (a) $\Gamma = -0.5$; (b) $\Gamma = 1.4$; (c)
$\Gamma = 2.0$. Flow (a) lies within the nonequivalence set, near the local minimum of $S$ with
respect to $\Gamma$; flow (b) lies near the equivalence-nonequivalence boundary, which itself is near
the global maximum of $S$ with respect to $\Gamma$; flow (c) lies in the equivalence set. We draw
particular attention to flow (a), which closely resembles the mean zonal winds observed in
a zone-belt domain of the Jovian atmosphere. Indeed, this shear flow consists of a strong
westward jet that resides between two strong eastward jets. Furthermore, even though
the prior distribution has a positive (cyclonic) skewness, this intense triple-jet flow has a
negative (anticyclonic) circulation. Interestingly, most of the coherent structures observed
on Jupiter and the other giant planets are anticyclonic. These general properties of flow
(a), which is representative of the most probable flows in the nonequivalence set, are not
shared by flows (b) and (c). Instead, each of these flows consists of one broad westward jet
and one narrow eastward jet, and the total circulation of each of them is positive (cyclonic).
These weaker shear flow structures are typical of the equivalence set.

Perhaps our most significant result is that the most probable flows corresponding to
constraint pairs $(E,\Gamma)$ in the nonequivalence set are nonlinearly stable, even though they
typically fail to satisfy the well-known stability conditions. The computed flows discussed
above illustrate this general result quite vividly. The most probable flows (a) and (b)
displayed in Figure 4 have negative $\beta$ and fail the often-quoted sufficient condition

$$0 < \frac{d\bar q}{d\bar\psi} < \lambda_1 \tag{49}$$

for the second Arnold stability theorem. Indeed, our computations show that for the triple-jet
flow (a), $d\bar q/d\bar\psi$ ranges from 27 to 78, while for the qualitatively different flow (b),
$d\bar q/d\bar\psi$ ranges from 26 to 42. Since $\lambda_1 = \pi^2 + r^{-2} \approx 35$, we conclude that flow (a), which
lies within the nonequivalence set, is far from satisfying (49), while flow (b), which lies near
the equivalence-nonequivalence boundary, comes closer to fulfilling (49).
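
The comparison just made is simple enough to script. The sketch below uses only numbers quoted in the text (the reported ranges of $d\bar q/d\bar\psi$, the value of $d\bar q/d\bar\psi$ reported for flow (c), and $\lambda_1 = \pi^2 + r^{-2}$); the function names and the three-way labels are conventions introduced here.

```python
import math

def smallest_eigenvalue(r):
    """lambda_1 = pi**2 + r**-2, as quoted in the text; r = math.inf gives pi**2."""
    return math.pi ** 2 + (0.0 if math.isinf(r) else r ** -2)

def arnold_criterion(dq_dpsi_min, dq_dpsi_max, lam1):
    """Classify a flow by the range of dq/dpsi over the domain.

    "first"  : dq/dpsi < 0 everywhere (first Arnold theorem)
    "second" : 0 < dq/dpsi < lambda_1 everywhere (condition (49))
    "none"   : neither sufficient condition holds
    """
    if dq_dpsi_max < 0.0:
        return "first"
    if dq_dpsi_min > 0.0 and dq_dpsi_max < lam1:
        return "second"
    return "none"

lam1 = smallest_eigenvalue(0.2)
flows = {"a": (27.0, 78.0), "b": (26.0, 42.0), "c": (-5.0, -5.0)}
labels = {name: arnold_criterion(lo, hi, lam1) for name, (lo, hi) in flows.items()}
print(round(lam1, 2), labels)
```

A label of "none" means only that the classical sufficient conditions are inconclusive; by Theorem 7 the corresponding most probable flows are still nonlinearly stable.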

By contrast, flow (c) in Figure 4, which is a positive temperature macrostate lying in
the equivalence set, satisfies the Rayleigh condition

$$\frac{d\bar q}{d\bar\psi} < 0 ,$$

which implies the first Arnold stability theorem. In fact, for flow (c), $d\bar q/d\bar\psi$ is approximately
equal to the constant $-5$ over the domain.

Let us comment further on this gap in the classical stability criteria. First, any microcanonical
equilibrium $\bar q$, which corresponds to a constraint pair $(E,\Gamma)$ belonging to the
equivalence set $\mathcal{C}$, is a global minimizer of the associated information functional $I_{\beta,\gamma}$. Thus,
in principle, the classical Arnold stability criterion applies, assuming only that the minimizer
is nondegenerate. Nevertheless, it is possible that the explicit sufficient condition
(49) may be violated, even though $I_{\beta,\gamma}$ is a Lyapunov functional for $\bar q$. Second, in the case
of an equilibrium $\bar q$ which lies slightly outside the equivalence set, it is possible that
$I_{\beta,\gamma}$ is a Lyapunov functional, if the microcanonical entropy $S(E,\Gamma)$ is locally concave at
$(E,\Gamma)$; then, the tangent plane corresponding to $(\beta,\gamma)$ is locally a supporting plane for $S$, even
though it does not support $S$ globally. Typically, the sufficient condition (49) is too crude
in such a delicate case. Third, for an equilibrium $\bar q$ which lies far outside the equivalence
set, $I_{\beta,\gamma}$ is not definite at $\bar q$, and the classical Lyapunov argument based on this functional
fails. Of course, in this nonequivalent case the sufficient condition (49) is violated.

The above analysis of the various cases possible in the classical stability criteria notwithstanding,
Theorem 7 guarantees that the microcanonical equilibrium states corresponding
to all admissible pairs $(E,\Gamma)$ define nonlinearly stable flows, provided only that a technical
nondegeneracy condition is fulfilled. Given this refined stability result, which makes use of
the penalized Lyapunov functional $L^{\sigma,\tau}_{E,\Gamma}$, it is not necessary to impose a restrictive condition
such as (49) to obtain the stability of most probable flows. Conversely, it is incorrect
to assume that a steady flow that strongly violates the well-known Arnold conditions is
unstable. In essence, these conditions are derived by utilizing a linear combination of two
independent conserved quantities (the energy and a certain enstrophy), while the conservation
of each of these quantities separately constrains the evolution of perturbations and
leads to more refined stability conditions.
Figure captions

Fig. 1. Admissible set $\mathcal{A}$ and concavity set $\mathcal{C}$ for the microcanonical variational principle
for a range of constraint values on energy ($0.01 < E < 0.1$) and circulation ($-2 < \Gamma < 2$).
The computed boundary of the admissible set is the dashed curve; the computed boundary
of the concavity, or equivalence, set is the solid curve. For each admissible constraint
pair $(E,\Gamma)$ in a grid over this range with $\Delta E = 0.0025$ and $\Delta\Gamma = 0.025$, the corresponding
equilibrium macrostate $\bar q$, multipliers $\beta$ and $\gamma$, and microcanonical entropy $S$ are computed.
A pair $(E,\Gamma)$ is accepted for the concavity set if the tangent plane at $\bar q$ with slopes $\beta$ and
$\gamma$ lies above the function $S$ throughout the admissible set. This computation is displayed
for two different choices of deformation radius: (a) $r = \infty$ and (b) $r = 0.2$.

Fig. 2. The section $S(E, 0)$ of the microcanonical entropy for the same variational problem
as in Figure 1. The solid curve is for (a) $r = \infty$, and the dashed curve is for (b) $r = 0.2$.

Fig. 3. The section $S(0.05, \Gamma)$ of the microcanonical entropy for the same variational
problem as in Figure 1. The solid curve is for (a) $r = \infty$, and the dashed curve is for (b)
$r = 0.2$.

Fig. 4. Mean velocity fields of the zonal shear flows determined by the most probable
macrostates for the microcanonical model with $r = 0.2$ and $E = 0.05$. Flows corresponding
to three different circulations are displayed: (a) $\Gamma = -0.5$, which lies within the nonequivalence
set; (b) $\Gamma = 1.4$, which lies near the equivalence-nonequivalence boundary; (c)
$\Gamma = 2.0$, which lies in the equivalence set.